A SYNTHETIC APPROACH TO FACTOR ANALYSIS *


KARL J. HOLZINGER 
UNIVERSITY OF CHICAGO 


Factor analysis is a type of statistical theory in which a variety 
of solutions can be obtained for a given set of data. The chief methods 
for obtaining such solutions were originated mainly by psychologists, 
each of whom has urged the suitability of his particular system of 
factors for describing psychological data. The arguments for prefer- 
ence have sometimes been based upon agreement of a certain psycho- 
logical theory with a proposed factorial method. Preference for a giv- 
en method has been argued because the factors so determined have 
“psychological meaning” whereas by other methods they do not. It 
has even been claimed that a certain method yields “invariant” factors 
in some sense, while other systems of factors lack this property. Other 
arguments for choice appear to be based upon purely statistical con- 
siderations leading to mathematically elegant solutions. 

These varying points of view have been wittily described by 
Cureton as follows: 


Factor theory may be defined as a mathematical rationalization. A factor 
analyst is an individual with a peculiar obsession regarding the nature of mental 
ability or personality. By the application of higher mathematics to wishful think- 
ing, he always proves that his original fixed idea or compulsion was right or 
necessary. In the process he usually proves that all other factor-analysts are 
dangerously insane, and that the only salvation for them is to undergo his own 
brand of analysis in order that the true essence of their several maladies may be 
discovered. Since they never submit to this indignity, he classes them all as 
hopeless cases, and searches about for some branch of mathematics which none of 
them is likely to have studied in order to prove that their incurability is not only 
necessary but also sufficient.†


In recent years, the methods of factor analysis have been success- 
fully applied not only in psychology but also in such varied fields as 
political science, business, and physical development. The justifica- 
tion of a particular method because of agreement with a psychological 
theory or psychological meaningfulness is clearly out of place for such
applications. The various methods, although first applied to psychological
data, are equally appropriate to the problems in any field where
measurement of a set of variables for a group of individuals can be
obtained. Factor analysis may thus be regarded as a branch of statistical
theory, the particular method to be employed in a given problem
depending upon the nature of the data and the statistical considerations
leading to a choice of some preferred form of solution.

* Address of the retiring President of the Psychometric Society at Pennsylvania
State College, September, 1940, based upon a treatise “Factor Analysis” by
Karl J. Holzinger and Harry H. Harman (in press).

† Cureton, Edward E., The principal compulsions of factor-analysts, Harvard
educ. Review, May, 1939, p. 287.

There is a decided advantage in separating the statistical aspects 
of factor analysis from the theories in a particular field. It is then 
possible to set forth clearly the properties of each of the preferred 
types of solution, and statistical criteria leading to a choice of form 
and method. A discussion of such types and standards will be given 
below. The particular nature of the problem and data may assist in 
deciding upon the form of solution. A theory in a given field may also 
suggest a choice of form, but such a theory should not be used as a 
basis for rejecting other possible forms, which are equally good sta- 
tistically, merely because they do not appear to conform to this theory. 
“Psychological meaning,” if it means anything at all, is a rather per- 
sonal phrase. Thus, one psychologist with a flexible mind might attach 
meaning to a given solution that would not be “meaningful” to an- 
other who held rigidly to a different psychological theory. Separation 
of the statistical phase of factor analysis from the psychological should 
also help to clarify discussions on the “invariance” of a solution. It 
has been argued that a given form of solution is invariant while others 
are not, the reasoning being based in part on psychological theory. If 
invariance is taken to mean lack of change in the statistical solution, 
such arguments are not only illogical but very confusing. Invariance
should be tested in purely statistical terms.

From the foregoing discussion it is apparent that there is con- 
siderable freedom of choice as to form of solution. This does not mean 
that there is still lack of harmony among methods, but rather that a 
worker should be prepared to accept several different forms of solu- 
tion for the same data. Not only should factorial methods not com- 
pete with one another, but, on the contrary, they should supplement 
one another. 

A factor analysis may lead to some theory suggested by the form 
of the solution, and conversely one may formulate a theory and verify 
it by an appropriate form of factorial solution. The latter approach 
is illustrated by Professor Charles Spearman’s theory that “all 
branches of intellectual activity have in common one fundamental 
function (or group of functions) whereas the remaining or specific 
elements of the activity seem in every case to be wholly different from
that in all others.”* He showed that if certain relationships (tetrads)
exist among the correlations, all the variables can be resolved into
linear expressions involving only one general factor and an additional
factor unique to each variable. This furnishes the statistical verifica- 
tion of the “Two-factor Theory.” It should be observed, however, that 
Spearman’s law has been verified only in the sense that he has chosen 
a statistical solution that could be harmonized with it. A much more 
complicated law and solution could be obtained for the same data. 
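
The tetrad condition can be illustrated numerically. The sketch below is a minimal example, assuming four hypothetical general-factor loadings `a` (the values and the function names are illustrative and do not come from the paper). If each correlation is the product of two such loadings, every tetrad difference vanishes.

```python
# Hypothetical loadings of four variables on a single general factor
# (illustrative values, not taken from the paper).
a = [0.9, 0.8, 0.7, 0.6]

def corr(j, k):
    """Correlation implied by one general factor: r_jk = a_j * a_k for j != k."""
    return a[j] * a[k]

def tetrad_differences(i, j, k, l):
    """Return the three tetrad differences for variables i, j, k, l."""
    return (
        corr(i, j) * corr(k, l) - corr(i, k) * corr(j, l),
        corr(i, j) * corr(k, l) - corr(i, l) * corr(j, k),
        corr(i, k) * corr(j, l) - corr(i, l) * corr(j, k),
    )

tetrads = tetrad_differences(0, 1, 2, 3)
print(tetrads)  # each difference is zero up to floating-point rounding
```

Correlations that depart from this product form would generally leave nonzero tetrad differences, so the criterion bears only on whether a one-general-factor solution is admissible, not on whether it is the only possible one.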

As indicated above, the approach in the present paper is primari- 
ly statistical and theories in psychology, or in other fields, are not 
formulated as the bases for the factorial solutions. On the contrary, 
mathematical and statistical bases are employed in the development 
of the preferred types of solution. From the properties of these pre- 
ferred solutions, the investigator can select that one which best con- 
forms to his theory or purpose. The analyst who does not have any 
preconceived theory may select a particular form of solution by em- 
ploying the standards here set forth. He may then propose a theory 
in harmony with this solution, and attempt to verify it by further 
experiments. In general, however, the analyst may obtain a useful 
factorial solution in accordance with the statistical criteria without 
the formulation of a theory. The enumeration of these standards for 
each of the preferred solutions and the comparison of these solutions 
should enable the investigator to make an objective choice in dealing 
with a particular problem. 

The essence of factor analysis is the resolution of a set of vari- 
ables into a much smaller number of underlying categories, or common 
factors, which convey all the essential information of the larger set. 
The underlying principle involved is that of scientific parsimony or 
economy of description. In order to obtain such a solution it is neces- 
sary to assume some mathematical law connecting the variables and 
factors. Inasmuch as a linear relationship is the simplest one, and 
lends itself to the ready application of statistical methods such as 
correlation, this is the form assumed for factor analysis. The essential
part of the factorial solution thus consists of a series of linear equa- 
tions expressing the n variables in terms of a much smaller number 
(m) of common factors, and additional specific and unreliable factors 
as indicated by the following expression: 


\[
z'_j = a_{j1}F_1 + a_{j2}F_2 + \cdots + a_{jm}F_m + b_j S_j + e_j E_j, \qquad (j = 1, 2, 3, \ldots, n) \tag{1}
\]


where the prime denotes an approximation to the observed z_j. The
various methods of factor analysis are devices for obtaining the numerical
values of the coefficients in such expressions under certain
restrictions.

* Spearman, Charles. General intelligence, objectively determined and measured.
Am. J. Psychol., 1904, 15, 201-293.

238 PSYCHOMETRIKA
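
A small numerical sketch may make the role of the coefficients concrete. It assumes standardized variables and uncorrelated factors, under which the correlation reproduced by the common factors is the sum of products of corresponding coefficients and the communality is the sum of their squares. The matrix `A` and the function names are hypothetical.

```python
# Hypothetical coefficients a_jp: three variables (rows) on m = 2
# uncorrelated common factors (columns). Values are illustrative only.
A = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.6],
]

def reproduced_correlation(A, j, k):
    """Sum over common factors p of a_jp * a_kp, for j != k."""
    return sum(ajp * akp for ajp, akp in zip(A[j], A[k]))

def communality(A, j):
    """Sum over common factors p of a_jp squared."""
    return sum(ajp ** 2 for ajp in A[j])

def unique_variance(A, j):
    """Part of the unit variance left to the specific and unreliable factors."""
    return 1.0 - communality(A, j)

r01 = reproduced_correlation(A, 0, 1)
h0 = communality(A, 0)
u0 = unique_variance(A, 0)
print(r01, h0, u0)
```

The methods of factor analysis run in the opposite direction: the correlations are observed, and the coefficients are sought under whatever restrictions define the preferred form of solution.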

It should be added that some analysts may not care to accept the 
postulation of factors as exhibited in equations (1). Some may choose 
to analyze reliability rather than communality as here indicated, while 
others may choose to analyze the unit variances, regarding all factors 
as common. In either of these cases the essential nature of the pre- 
ferred forms to be discussed remains unchanged. 

It will be helpful at this point to distinguish two geometric 
aspects of factor analysis. By way of geometric representation of a 
set of values of two variables, z_j and z_k, it is customary to think of
z_{ji} and z_{ki} (i = 1, 2, 3, ..., N) as the coordinates of N points in the
plane z_j O z_k. This plot of points is called a scatter diagram and furnishes
a clear understanding of the relations involved in the formula
for product-moment correlation. In general, for n variables this will
be referred to as the point representation. 

Even more important in some respects is a geometric representa- 
tion not by N points in a plane (for two variables) but by two points 
in an N-space. The two variables are then represented by the points 
whose coordinates are the sets of N values for each variable. The sets 
of coordinates of these points may also be referred to as vectors, and 
for the general case of n variables the configuration of vectors may be
designated as the vector representation. It may be readily shown that 
the coefficient of correlation between two variables is the cosine of 
the angle between their vectors in N-space. 
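
The last statement can be verified on a small example. The following sketch, with hypothetical scores for N = 5 individuals, assumes each variable is expressed as deviations from its mean, so that its N values form a vector from the origin; the product-moment coefficient and the cosine of the angle between the two vectors then coincide.

```python
import math

# Hypothetical scores of N = 5 individuals on two variables.
x = [2.0, 4.0, 6.0, 8.0, 5.0]
y = [1.0, 3.0, 2.0, 6.0, 3.0]

def deviations(values):
    """Express a variable as deviations from its mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def cosine(u, v):
    """Cosine of the angle between two vectors in N-space."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def pearson(x, y):
    """Product-moment correlation from the covariance and standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

r = pearson(x, y)
cos_angle = cosine(deviations(x), deviations(y))
print(r, cos_angle)  # the two values agree up to rounding
```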

One purpose of factor analysis is to represent the above config- 
uration of vectors in as small a space as possible, e.g., the common-fac- 
tor space. This is accomplished by the orthogonal projection of the n 
vectors in the total factor space of m + 2n dimensions into the common
factor space of m dimensions. Such vectors may be denoted by
z''_j. All of the present illustrations are analytic forms of vectors in
this space. 

In the choice of a scientific hypothesis regarding the factors, two 
possibilities immediately arise from the fact that the factors may be 
taken as correlated or uncorrelated. This leads to two distinct devel- 
opments of the theory. It is possible, however, to formulate standards 
or criteria leading to appropriate forms of solution for both types of 
factors. These will be presented in detail together with practical illus- 
trations of each type of pattern. 

In order to limit the infinitude of possible factor solutions that 
can be obtained in describing a given matrix of correlations, a set of 
restrictions must be imposed to obtain some preferred reference system.
To this end, a list of statistical standards is presented which is
based partially upon those found useful in previous investigations. 
Some of these standards are analytical in character while others are 
of a geometric nature. Inasmuch as both these types of standards 
are designed to produce simple forms of solution, certain of these 
standards are somewhat related. They are listed here separately, 
however, because they support one another and simplify the selection 
of preferred solutions. By presenting such a broad list of criteria, 
the assumptions underlying each type of preferred solution can be 
exhibited explicitly. 


1. Agreement with Assumed Composition of Variables 


The composition of variables postulated in equation (1) was 
based on three types of observation, namely, that correlations are 
found amongst variables of a set, that potential linkages may also 
occur amongst variables in the set with others not included, and that 
all measurement is subject to error. These considerations led to the 
fundamental equations (1) involving common, specific and unreliable 
factors. All forms of solution should obviously conform to the fac- 
torial composition of the statistical variables postulated in such linear 
equations. As indicated above, a different form of (1) is sometimes 
assumed by other analysts. 


2. Parsimony 


According to the principle of parsimony common to all branches 
of science, a law or description should be simpler than the data upon 
which it is based. This may be illustrated in the fitting of a theoreti- 
cal curve to a series of observations. The number of constants in 
such a function should be much smaller than the number of observa- 
tions in order to give a simple and useful interpretation of the latter. 
Similarly, in a factor problem the functional description of the vari- 
ables should be much simpler than the original data. 

(a) Number of common factors. — In agreement with this 
principle the total number of common factors should be considerably 
smaller than the number of descriptive variables comprising the orig- 
inal set. 

(b) Complexity. — It is also desirable that this principle of 
parsimony be applied to individual variables. In the linear descrip- 
tion of each variable the complexity or number of common factors in- 
volved should be as small as possible. 
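
As a rough illustration, complexity can be counted directly from the coefficients of a pattern. The pattern below and the tolerance `tol`, below which a coefficient is treated as negligible, are hypothetical choices made for this sketch only.

```python
def complexity(coefficients, tol=0.10):
    """Number of common factors whose coefficient exceeds tol in absolute value."""
    return sum(1 for a in coefficients if abs(a) > tol)

# Hypothetical pattern: rows are variables, columns are m = 3 common factors.
pattern = [
    [0.70, 0.00, 0.00],
    [0.60, 0.45, 0.00],
    [0.30, 0.40, 0.50],
]

complexities = [complexity(row) for row in pattern]
print(complexities)  # [1, 2, 3]
```

Under standard 2(b) the first row is the most desirable description and the third the least, although all three use the same number of common factors overall.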


3. Uncorrelated Factors


The first stage in any factorial analysis must involve a choice of 
correlated or uncorrelated factors. In the assumed composition of 
variables there is no restriction as to whether or not the factors shall 
be correlated. The observed correlations are fitted equally well by so- 
lutions involving factors of either type. Because of the indeterminacy 
of factorial solutions, investigators have found it convenient to start 
with an orthogonal solution. This form may then be retained or else 
rotated to some preferred orthogonal or oblique solution. 
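
The indeterminacy mentioned here can be seen in a short sketch, assuming two uncorrelated factors and a hypothetical coefficient matrix `A`: an orthogonal rotation of the reference axes changes the coefficients but leaves the reproduced correlations and communalities unchanged.

```python
import math

# Hypothetical orthogonal solution: three variables on two uncorrelated factors.
A = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.6],
]

def rotate(A, theta):
    """Refer each variable to axes rotated through the angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a1 * c + a2 * s, -a1 * s + a2 * c] for a1, a2 in A]

def reproduced(A):
    """Matrix of scalar products of the rows (communalities on the diagonal)."""
    return [[sum(x * y for x, y in zip(rj, rk)) for rk in A] for rj in A]

B = rotate(A, math.radians(30))
max_change = max(
    abs(p - q)
    for row_a, row_b in zip(reproduced(A), reproduced(B))
    for p, q in zip(row_a, row_b)
)
print(B)
print(max_change)  # zero up to floating-point rounding
```

The sketch covers only orthogonal rotation; a transformation to oblique axes would also require the correlations among the factors.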

The advantages of a solution involving uncorrelated factors 
arise from convenience of initial solution and subsequent interpreta- 
tion of factors. As just indicated, an orthogonal solution is the fun- 
damental form from which others may be derived. In some instances, 
however, such a solution may conform to a preferred type without 
further transformation. If uncorrelated factors are selected for the 
final solution, there is a decided clarity of interpretation, especially 
in the description of individuals in terms of factors. Such descrip- 
tions are clearer and more economical than if expressed in terms of 
interrelated factors. 


4. Relative Contributions of Factors 


Another standard which may be useful in the selection of a par- 
ticular type of solution is based upon the relative contributions of 
factors. The contribution of a factor is given by the sum of the 
squares of its coefficients for all the variables. Three useful types of 
relationships between the contributions of a set of factors follow: 

(a) Decreasing contributions. — In this case the various fac- 
tors contribute successively smaller amounts to the total communality. 

(b) Level contributions.—A second choice is one in which each 
factor contributes approximately an equal amount to the total com- 
munality. 

(c) One large and remaining level contribution. — The third 
type of relationship is that in which one factor contributes a large 
amount, while the remaining factors contribute a much smaller but 
fairly uniform amount to the total communality. 
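
The contributions defined above can be computed as column sums of squared coefficients. The following sketch uses a hypothetical two-factor pattern (not one of the paper's tables) and also confirms that, for uncorrelated factors, the contributions add up to the total communality.

```python
# Hypothetical pattern: four variables (rows) on two uncorrelated factors (columns).
pattern = [
    [0.70, 0.10],
    [0.60, 0.20],
    [0.20, 0.50],
    [0.10, 0.40],
]

def contributions(pattern):
    """Contribution of each factor: sum of squared coefficients down its column."""
    m = len(pattern[0])
    return [sum(row[p] ** 2 for row in pattern) for p in range(m)]

def communalities(pattern):
    """Communality of each variable: sum of squared coefficients along its row."""
    return [sum(a ** 2 for a in row) for row in pattern]

contrib = contributions(pattern)
total = sum(communalities(pattern))
print(contrib, total)
```

Here the first factor contributes about 0.90 and the second about 0.46, so this hypothetical pattern would fall under type (a), decreasing contributions.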


5. Geometric Fit: Vector Representation 


The geometric interpretation of the factor problem frequently 
adds clarity to the analytical method. This is especially true in formulating
a set of standards leading to several distinct types of preferred
patterns. When such distinctions have been made in geometric
terms, the corresponding analytical properties of the factor patterns
are explained. The following geometric criteria furnish the
bases for several scientifically desirable factor patterns. 

(a) Linear fit. — An obvious justification for postulating com- 
mon factors is that the variables in a particular investigation usually 
are correlated. Furthermore, certain subsets of variables may show 
generally higher intercorrelations among themselves than with the 
remaining variables of the total set. Group factors corresponding to 
each cluster of this type may then be assumed. In a sense, then, a 
group factor may be regarded as a sort of average or the common 
element of the variables of such a subset. 

It has been noted that the variables may be regarded as vectors 
with a common origin and that the correlations between variables are 
given by the cosines of the angles between such vectors in the N-space, 
or by the scalar products of the projected vectors in the common fac- 
tor space. A group of variables yielding a cluster of high intercorre- 
lations is thus encompassed by a “cone” of vectors with a relatively 
small generating angle. If a reference axis or vector of the common- 
factor space is chosen in the midst of this cone, all variables in the 
group will tend to correlate highly with it. To a factor common to a 
group of variables, there thus corresponds a reference vector. The 
most satisfactory form of linear fit is one for which the “cone” rep- 
resenting the particular group of variables is compact, i.e., the vec- 
tors of these variables tend to approach the axis of reference. By se- 
lecting a number of such reference axes, each one passing through a 
cone of vectors, the whole configuration may finally be well fitted. 

The standard of linear fit, together with that of uncorrelated 
factors, usually can be met only roughly in the case of positive cor- 
relations among the variables. It is evident that a closer linear fit can 
generally be obtained by permitting the factors to be correlated. 
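
The idea of a reference axis placed in the midst of a cone of vectors can be sketched as follows. The example assumes three hypothetical unit vectors in a plane and takes the axis along their normalized sum; this centroid choice is an assumption of the sketch, since the text requires only that the axis lie within the cone.

```python
import math

def unit(angle_deg):
    """Unit vector in the plane at the given angle from the first coordinate axis."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))

# Hypothetical compact cluster of three variables.
cluster = [unit(10), unit(20), unit(30)]

def reference_axis(vectors):
    """Unit vector along the sum of the cluster's vectors."""
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    norm = math.hypot(sx, sy)
    return (sx / norm, sy / norm)

axis = reference_axis(cluster)
cosines = [v[0] * axis[0] + v[1] * axis[1] for v in cluster]
print(cosines)  # all close to 1 for a compact cone
```

A wider cone would give smaller cosines, which corresponds to a poorer linear fit of that group of variables.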

(b) Planar fit. — The type of geometric fit just described may 
also be interpreted as a planar fit. Each plane is defined by two of 
the reference axes, or by the end points of the two reference vectors 
and the origin. Good geometric fit is then indicated by the proximity 
of the vectors representing the variables in the common-factor space 
to such planes. 

A subset of variables may be well represented by vectors in a 
reference plane of the common-factor space, even though they do not 
form two distinct clusters. In this case the vectors present a fan- 
shaped configuration by means of which the reference plane is de- 
fined. The two reference axes in this plane can be selected with much 
greater freedom than when the axes are restricted to the clusters of 
vectors.

The equivalence of the geometric and algebraic interpretation of
a statistical variable will now be pointed out. As indicated above, a 
variable may be considered as a linear function of, say, m common 
factors. Geometrically, this means that the vector z''_j representing the
variable lies in an m-space, which is defined by the m reference axes 
representing the factors. Thus if a vector lies in a plane, it is describ- 
able in terms of two reference vectors, and hence in the algebraic 
description of the variable only two common factors appear. Of 
course, other variables of the set might involve these or other factors. 

(c) Hyperplanar fit. — In the two preceding standards, one- 
spaces and two-spaces were the bases for determining the adequacy 
of geometric fit. This idea can be extended to higher spaces. By hy- 
perplanar fit in a space of m dimensions is meant that each vector 
representing a variable in the common factor space lies in an (m - 1),
or smaller, space. 

When a set of variables satisfies this standard, the complexity of 
any one of them is less than the total number of common factors. This 
does not appear as a very stringent criterion at first sight, because it 
is satisfied if each variable is of complexity (m - 1) for m common
factors. The strength of this standard, however, lies in the fact that
the hyperplane is the largest permissible space containing each variable.
In other words, it is hoped that there will always be smaller reference
spaces which contain certain subgroups of variables. Thus the
complexities of the variables are reduced below (m - 1). In particular,
a vector which lies on a reference axis is contained in that one-space
and the variable it represents has a complexity of one. Similarly,
if a vector lies in a reference plane, the variable is of complexity
two. This analogy between the smallest reference space in which a vector
is contained and the complexity of the variable can be extended to
any degree.

It is evident that the two preceding geometric standards may be 
considered as special cases of hyperplanar fit. For, if a set of vari- 
ables satisfies the criterion of linear fit, or planar fit, it certainly con- 
forms to hyperplanar fit. The converse, of course, is not true gen- 
erally. Hence the ideal to be aimed for is to reduce the hyperplanar 
fit to a geometric fit of as small a number of dimensions as possible, 
the limit being linear fit. 


6. Geometric Fit: Point Representation 


Two alternative geometric representations of a set of variables 
were pointed out above. By employing the vector representation, 
three standards were immediately evident which will lead to as many
types of preferred factor patterns. Now by considering the point representation,
another standard evolves. It will be recalled that in this
representation there is one point for each of the N individuals, re- 
ferred to a system of n reference axes — one for each variable. The 
points which are plotted in this n-space are contained in a common- 
factor space of only m dimensions. The loci of the swarm of points 
of uniform frequency density are, more or less, concentric, similar, 
and similarly situated m-dimensional ellipsoids, being exactly so for 
a normally distributed population.* It then seems natural to take the 
principal axes of these ellipsoids as the fundamental reference axes. 
This standard, which is called ellipsoidal fit, leads to another preferred
type of factor pattern.†
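
The first principal axis of such ellipsoids can be found numerically. The sketch below is a minimal illustration using power iteration on a hypothetical 3 x 3 correlation matrix `R`; it keeps unities in the diagonal for simplicity, which is an assumption of the sketch and not necessarily the procedure of the principal-factor solution discussed in this paper.

```python
import math

# Hypothetical correlation matrix for three variables.
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]

def mat_vec(M, v):
    """Product of a square matrix with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def principal_axis(M, iterations=200):
    """Largest root of M and its unit direction, found by power iteration."""
    v = [1.0] * len(M)
    for _ in range(iterations):
        w = mat_vec(M, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    root = sum(x * y for x, y in zip(v, mat_vec(M, v)))
    return root, v

root, axis = principal_axis(R)
# Coefficients of the first factor: direction cosines scaled by the square root of the root.
loadings = [math.sqrt(root) * x for x in axis]
print(root, loadings)
```

The sum of squares of these coefficients equals the largest root, so the first axis accounts for the largest possible contribution, which is what makes the factors of this form appear in order of importance.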

To these criteria may be added that of statistical stability after 
a solution has been obtained. From a statistical point of view, several
of the preferred patterns may fit a matrix of observed correlations 
equally well. The ultimate choice of type of factor pattern must then 
rest, in part, upon the nature of the variables and the utility of the 
solution in the particular field of investigation. 

Formal descriptions of various preferred types of factor pattern 
are presented in the text, but for the present paper simple numerical 
illustrations are given of such patterns. Each of these illustrations 
should be regarded as the analytic expression of the variables pro- 
jected in the common-factor space. 

The first type of pattern has been called the uni-factor form. Its 
essential characteristic is that each variable has a complexity of one. 
Inasmuch as factors describing the variables in the sense of linear 
fit are not generally orthogonal to each other, the ideal uni-factor
form is not to be expected when uncorrelated factors are assumed. 
By employing correlated factors, however, very good approximations 
to such a form may sometimes be obtained. This is illustrated by Pat- 
tern I which is based upon thirteen psychological tests briefly de- 
scribed at the end of this paper. 

The structure values, or correlations of the tests with factors, are 
also recorded, inasmuch as these throw further light on the interpre- 
tation of the factors and are useful for their estimation. A brief dis- 
cussion of the naming of factors will be given at the end of this paper. 
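
The relation between pattern and structure for correlated factors can be sketched numerically. The example assumes the usual relation in which each structure value is the sum of the pattern coefficients weighted by the correlations among the factors; the pattern rows and the factor correlations `phi` below are hypothetical and are not the values of the oblique solution tabulated in this paper.

```python
# Hypothetical pattern coefficients: two tests (rows) on three correlated factors.
pattern = [
    [0.70, 0.05, 0.02],
    [0.00, 0.80, 0.00],
]

# Hypothetical correlations among the three factors.
phi = [
    [1.00, 0.58, 0.35],
    [0.58, 1.00, 0.40],
    [0.35, 0.40, 1.00],
]

def structure(pattern, phi):
    """Correlations of tests with factors: pattern multiplied by factor correlations."""
    m = len(phi)
    return [
        [sum(row[q] * phi[q][p] for q in range(m)) for p in range(m)]
        for row in pattern
    ]

S = structure(pattern, phi)
print(S)
```

Even a test whose pattern shows a complexity of one, such as the second row, has nonzero correlations with every factor when the factors are correlated. With uncorrelated factors `phi` is the identity matrix and structure and pattern coincide.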

The bi-factor pattern is essentially a modification of the preceding
type. It consists of a general factor and a number of mutually
exclusive group factors as illustrated by Table 2. All of the factors
have been taken as orthogonal, and the complexity of each of the
variables is now two. The additional general factor has made possible
a more convenient interpretation of the variables in terms of uncorrelated
factors, but at the sacrifice of the greater simplicity of the
uni-factor form.

* Yule, G. Udny and Kendall, M. G. An introduction to the theory of statistics.
London: Charles Griffin & Co., Limited, 1937. Eleventh Edition. Chap-

† The application of higher-dimensional ellipsoids in the field of factor analysis
was proposed by Truman L. Kelley at a meeting of the Unitary Traits Committee
in 1933.

TABLE 1
Oblique Uni-factor Solution

Structure | Pattern
Test | | Spatial | Verbal | Speed
,.) oe "iy, | THs) YN, Ye | Ys
1 14 AT | 36 | .70 05 | 02
2 5 26 | 20| a ao
3 59 28 22 | 65 | —08 | —.04
4 56 85 27 58 04 02
5 50 80 | 38 04 7 | 02
6 AG 80 | 325] —01 81 | —.01
7 A4 86 | 34) —.08 92 | —.04
8 55 70 | .42 18 55 | 09
9 | 41 | 85 | 81) —12 | 95 | —06
10 19 | «ag | 71 |} —11 | —12/ 81
11 | .32 338 | .67| 02 | .02 65
12 26 | .27 | .72| —.05 | —.05 77
13 | 49 | 50 | 72 | 14 14 .60

TABLE 2
Bi-factor Pattern

| General | Spatial | Verbal Speed |
Test B, | 8B, B, B, | Communality
1 61 | A8 56
2 34 | 30 | 20
3 ha 36
4 46 | 32 | 31
5 65 | | 47 65
6 60 | 58 64
7 58 | 65 15
8 72 24 57
9 58 | 69 76
10 124 | 70 55
11 42 | | 52 45
12 35 | | | 64 58
13 64 | | |__-48 60
Contribution of factor | 3.54 | 0.60 | 1.44 1.36 6.94

Again assuming uncorrelated factors, a solution of the multiple 
factor type may be obtained. This is illustrated by the solution of 
Table 3, in which three common factors appear. Such a solution involves
greater parsimony in the number of common factors as compared
with the bi-factor form, but often fails to meet the standard
of low complexity for the individual tests, as in the present case.

TABLE 3
Multiple-factor Pattern

                         Spatial   Verbal   Speed
Test                       M_1      M_2      M_3    Communality
  1                        .64      .17      .22        .49
  2                        .41      .10      .12        .20
  3                        .55      .14      .03        .32
  4                        .50      .20      .14        .31
  5                        .21      .72      .32        .66
  6                        .26      .74      .21        .65
  7                        .20      .76      .24        .68
  8                        .35      .58      .29        .54
  9                        .20      .77      .19        .67
 10                       -.13      .15      .70        .53
 11                        .11      .13      .67        .48
 12                        .16     -.05      .68        .49
 13                        .42      .09      .63        .58
Contribution of factor    1.68     2.70     2.20       6.58

The last type of preferred pattern is the principal-factor solution. 
For the above data the solution of this form is given in Table 4. All 
factors are again taken as orthogonal. A distinguishing character- 
istic of this type of pattern is that the coefficients of all factors except 
the first are both positive and negative. These factors will be called 
“bi-polar” factors, a term first introduced by Cyril Burt.* 

Such a factor may appear in all variables of a set or only in a
subgroup of them. A bi-polar factor is not essentially different from
any other but is merely one for which several of the variables have
significant negative projections. Such variables may be regarded as
measuring the negative aspect of the usual type of factor. Thus if a
number of variables identified with “fear” are represented by positive
projections, variables with negative projections might be interpreted
as measuring “courage.” It would appear simpler, however,
to regard the factor merely as “fear,” and the opposing set of variables
as measures of “negative fear.” Of course, the signs of all the
coefficients of the factor may be changed without altering the adequacy
of the solution. Such reversal in the foregoing example would
lead to the interpretation of the factor as “courage,” and the subgroup
of variables with negative coefficients would be regarded as
measuring “negative courage.”

* Burt, Cyril. The factorial analysis of emotional traits. Character and Personality,
1939, 7, 238-254, 285-299.

TABLE 4
Principal-factor Pattern

Test P_1 P_2 P_3 Communality
1 | 67 —.14 AT 57
2 184 —01 .29 .20
3 40 —.08 6 38
4 AT —.02 82 82
5 . = —21 | al .66
6 74 —.30 —11 65
7 .76 —32 | —.20 42
8 72 —i 0 fll 54
9 74 —36 | —19 2a
10 A4 +49 | —41 .60
11 51 +42 | —14 6
12 | 4 +.59 —.04 54
13 | 63 +.48 15 61
Contribution of factor | 4.66 1.37 0.94 6.97
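
The remark that the signs of all coefficients of a factor may be reversed without altering the adequacy of the solution can be checked directly. The sketch below assumes uncorrelated factors and a hypothetical pattern whose second factor is bi-polar; the values and function names are illustrative only.

```python
# Hypothetical pattern: a general first factor and a bi-polar second factor.
pattern = [
    [0.8, -0.4],
    [0.7, -0.3],
    [0.6, 0.5],
    [0.5, 0.4],
]

def reflect(pattern, p):
    """Reverse the sign of every coefficient of factor p."""
    return [[-a if q == p else a for q, a in enumerate(row)] for row in pattern]

def reproduced(pattern):
    """Scalar products of the rows: reproduced correlations and communalities."""
    return [[sum(x * y for x, y in zip(rj, rk)) for rk in pattern] for rj in pattern]

flipped = reflect(pattern, 1)
unchanged = all(
    abs(p - q) < 1e-12
    for row_a, row_b in zip(reproduced(pattern), reproduced(flipped))
    for p, q in zip(row_a, row_b)
)
print(flipped)
print(unchanged)  # True
```

Since the fit is identical either way, the choice between the two directions of a bi-polar factor is a matter of naming rather than of statistics.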

It will be observed that no names appear for the factors of Table 
4, but this is merely because no simple names for the two bi-polar fac- 
tors are clearly evident. The writer is of the opinion that the naming 
of factors immediately after a solution is obtained is usually an un- 
necessary and unwise practice. The factors are defined in part by the 
very nature of the mathematical processes leading to a particular so- 
lution. To attach a name merely from the variables which have high 
coefficients for various factors would seem to be far from clear. Thus 
the term “spatial” has been used in each of the four solutions above, 
because four “spatial” tests have the highest coefficients in each case. 
The symbol under the name has been employed, however, to indicate 
that the factors, although somewhat the same in nature, are essen- 
tially different, due to the varying form of solution. 

Inasmuch as psychologists in particular appear to be eager to 
name factors, the rationale for such naming may be briefly discussed
for the case of the bi-polar type. It would seem desirable that whatever
name is selected for a bi-polar factor, it should have a clearly recog- 
nizable negation. A more fundamental approach is to find a basic 
term which connotes the entire continuum. For example, a bi-polar 
factor which is named “Heat” (or, “Cold”), would have the opposite 
characteristic “Cold” (or, “Heat”). A name representing both of
these characteristics is “Temperature.” These two approaches may be 
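indicated schematically as in Figure 1.

FIGURE 1. Two ways of naming a bi-polar factor: (a) opposite poles, Cold and
Heat, on either side of 0; (b) a single continuum, Temperature, measured from 0.
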

An attempt to give such a name to a bi-polar factor is illustrated
by the data of Table 5. The first four variables appear to measure
“lankiness,” while the last four measure “stockiness.” Inasmuch as
these two terms are not clearly distinguishable as opposites according
to (a) of Figure 1, neither of these terms seems to be an appropriate
name for the bi-polar factor. In an attempt to get a name of type
(b), which transcends the specific description of the variables, the
term “body type” (BT) has been adopted. On this continuum, variables
describing two body types have projections of opposite sign.
The name selected may not be the most fortunate, but it is believed
that the manner of naming is more correct than the usual procedure.

TABLE 5
Principal-factor Pattern for Eight Physical Variables

Variable                          G       BT     Communality
1. Height                        .86     -.33        .84
2. Arm Span                      .85     -.41        .89
3. Length of Forearm             .81     -.41        .83
4. Length of Lower Leg           .83     -.34        .80
5. Weight                        .75      .56        .87
6. Bi-trochanteric Diameter      .64      .51        .66
7. Chest Girth                   .56      .49        .55
8. Chest Width                   .62      .37        .52
Contribution of factor          4.46     1.51       5.97

Returning to the preferred types of pattern, it may be observed
that the criteria set forth do not furnish the complete basis for a
choice of form. This choice will also depend upon the nature of the 
variables and the theories or laws in a particular field of application. 
Thus, if an investigator considers the variables to be of the bi-polar 
type, the principal-factor form would be appropriate as in the last 
illustration. In psychology, if a general factor is denied in accord- 
ance with some theory, then the multiple-factor type would be con- 
sistent; but if the general factor is accepted by an alternative theory, 
a form such as the bi-factor might be appropriate. According to a 
third theory an analysis might be desired in which the factors appear 
in order of importance. This would be furnished by the principal- 
factor solution. The formally different solutions resulting from these 
three hypotheses throw no light, of course, on the correctness of the 
three theories postulated, but are merely consistent with each in turn. 

In factor analysis as in all empirical sciences, several equally 
satisfactory laws may be usefully employed, although they may be 
formally quite different. A somewhat analogous situation arises in 
that branch of Physics known as Hydrodynamics. A clear discussion 
of this problem is given by Horace Lamb as follows: 


The equations of motion of a fluid have been obtained in two different forms, 
corresponding to the two ways in which the problem of determining the motion of 
a fluid mass, acted on by given forces and subject to given conditions, may be 
viewed. We may either regard as the object of our investigations a knowledge 
of the velocity, the pressure, and the density, at all points of space occupied by 
the fluid; or we may seek to determine the history of every particle. The equa- 
tions obtained according to these two approaches are called the “Eulerian” and 
“Lagrangian” forms of hydrokinetic equations. 


In the case of factor analysis the guides as to choice of form are 
not so clear, but at the present stage of development it is urged that 
psychologists, and particularly members of the Psychometric Society, 
see in any form selected only a variation of one fundamental method 
of analysis. They should therefore stop trying to find the universally 
“right” solution. From criteria such as those listed above a worker 
may select a particular form of solution in harmony with his own the- 
ory or purpose, and he should be free to admit that some other form 
might give a solution equally satisfactory from a statistical point of 
view. It might also be very helpful to discard such terms as “psycho- 
logical meaning,” “fundamental abilities,” and “unitary traits” until 
at least a considerable number of psychologists agree upon the mean- 
ing of these words. A mathematically defined alpha may be more 
meaningful than a verbalized description from the content of the 
tests. 
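
The statement that another form of solution may be equally satisfactory from a statistical point of view can be illustrated for the simplest case. The sketch below, which is not part of the original paper, rotates a two-factor orthogonal pattern through an arbitrary angle and compares the communalities and the correlations reproduced by the two patterns. The four rows are taken from Table 5; the 45-degree angle and the function names are assumptions made only for illustration, and oblique solutions are not covered.

```python
import math

# Four rows of the Table 5 pattern: loadings on G and BT.
PATTERN = [
    (0.85, -0.41),  # Arm Span
    (0.81, -0.41),  # Length of Forearm
    (0.75, 0.56),   # Weight
    (0.56, 0.49),   # Chest Girth
]


def rotate(pattern, degrees):
    """Apply an orthogonal rotation of the two factor axes to every row."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in pattern]


def communalities(pattern):
    """Sum of squared loadings for each variable."""
    return [a * a + b * b for a, b in pattern]


def reproduced_correlations(pattern):
    """Inner products of all pairs of rows (diagonal entries are communalities)."""
    n = len(pattern)
    return [
        [sum(x * y for x, y in zip(pattern[j], pattern[k])) for k in range(n)]
        for j in range(n)
    ]


# The angle is arbitrary; any other value leads to the same comparison.
ROTATED = rotate(PATTERN, 45.0)

if __name__ == "__main__":
    for before, after in zip(PATTERN, ROTATED):
        print(before, "->", tuple(round(v, 3) for v in after))
    print([round(h, 4) for h in communalities(PATTERN)])
    print([round(h, 4) for h in communalities(ROTATED)])
```

The individual loadings change under the rotation, while the communalities and the reproduced correlations do not. The two patterns therefore fit the data equally well, and the choice between them rests on criteria of the kind discussed above.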
Some Standards for Judging Preferred Patterns 


1. Agreement with Assumed Composition of Variables 

2. Parsimony 
   a) Number of common factors 
   b) Complexity 

3. Relations among Factors 
   a) Uncorrelated 
   b) Correlated 

4. Relative Contributions of Factors 
   a) Decreasing contributions 
   b) Level contributions 
   c) Large and successive level contributions 

5. Geometric Fit: Vector Representation 
   a) Linear fit 
   b) Planar fit 
   c) Hyperplanar fit 

6. Geometric Fit: Point Representation 
   (Ellipsoidal fit) 


TABLE 6 

Assumptions and Properties of Preferred Patterns 

                                            Number of    Complexity 
Type of                                     Common       of Each      Distinguishing 
Pattern            Assumptions              Factors      Variable     Characteristics 

Uni-factor                                                            Distinct group factors 
  Orthogonal       1, 2a, 2b, 3a, 4b, 5a    m            1 
  Oblique          1, 2a, 2b, 3b, 4b, 5a    m            1 
Bi-factor          1, 2a, 2b, 3a, 4c, 5b    m or m+1     2            One general plus group factors 
Multiple-factor    1, 2a, 2b, 3a, 4b, 5c    m            <m           Overlapping group factors 
Principal-factor   1, 2a, 3a, 4a, 6         m            m            One general plus bipolar factors 


Brief Descriptions of Psychological Variables 


1. Visual Perception Test. A non-language multiple-choice test composed 
of items selected from Spearman’s Visual Perception Test, Part III. Testing 
time: 19 minutes. 

2. Cubes. A simplification of Brigham’s test of spatial relations. Testing 
time: 8 minutes. 

3. Paper Form Board. A revised multiple-choice test of spatial imagery, 
with dissected squares, triangles, hexagons, and trapezoids. Testing time: 8 
minutes. 

4. Flags. Adapted from a test by Thurstone. Requires visual imagery in 
two or three dimensions. Testing time: 5 1/2 minutes. 

5. General Information. A multiple-choice test of a wide variety of simple 
scientific and social facts. Testing time: 18 minutes. 

6. Paragraph Comprehension. Part III of Traxler Silent Reading Test, 
Form 1, for Grades 7 to 10. Comprehension measured by completion and mul- 
tiple-choice questions. Testing time: 20 minutes. 

7. Sentence Completion. A multiple-choice test in which “correct” answers 
reflect good judgment on the part of the subject. Testing time: 6 minutes. 

8. Word Classification. Arranged by M. A. Wenger. Sets of five words one 
of which is to be indicated as not belonging with the other four. Testing time: 
10 minutes. 

9. Word Meaning. Part II of Traxler Silent Reading Test. A multiple- 
choice vocabulary test. Testing time: 14 minutes. 

10. Add. Speed of adding pairs of one-digit numbers. Testing time: 2 min- 
utes. 

11. Code. A simple code of three characters is presented and exercise therein 
given to measure perceptual speed. Testing time: 2 minutes. 

12. Counting Groups of Dots. Four to seven dots, arranged in random pat- 
terns, to be counted by the subject. A test of perceptual speed. Testing time: 
4 minutes. 

13. Straight and Curved Capitals. A series of capital letters. The subject is 
required to distinguish between those composed of straight lines only and those 
containing curved lines. A test of perceptual speed. Testing time: 3 minutes. 
The relations of abilities, as measured by Thurstone’s Tests for 
Primary Mental Abilities, to activity preferences, as measured by 
Kuder’s Preference Record, are investigated for a population of 512 
university freshmen. Ability profiles for contrasted groups on each 
preference scale reveal relatively slight overlapping between the two 
sets of measures, although the apparent trends are reasonable. The 
Pearson inter-correlation coefficients of all pairs of measures in- 
volved were determined. Implications of the findings in relation to 
theory and to educational and vocational guidance are indicated. 


To what extent are one’s abilities related to the types of activ- 
ities which he prefers? The answer to this question is important from 
the standpoint of its bearing on the theory of mental measurement 
and from the standpoint of its implications for vocational and edu- 
cational guidance. 

A number of investigators have attacked the problem of the re- 
lation of interests to abilities. The measures of interests have ranged 
from estimates of interests in general fields to scores determined from 
inventories of very specific interests. The ability scores have been 
measures or estimates of general intelligence, of abilities in specific 
subject-matter fields, or of mechanical abilities. In general, high cor- 
relations have been reported between self-estimates of interests and 
self-estimates of abilities, particularly when these estimates were 
made in retrospect. The use of self-estimates of interests and abil- 
ities, made at the same time, has obvious defects. Substitution of in- 
terest inventories and objective measures of abilities for more sub- 
jective estimates has seemed to indicate only negligible relations be- 
tween the interests and abilities measured. These conclusions, of 
course, are limited to the particular ability and interest fields studied. 
Those desiring a survey of previous research in this field are referred 
to Fryer’s Measurement of Interests.* 


* Fryer, D. H. Measurements of interests. New York: Henry Holt and Co., 
The present study is an investigation of the relation of abilities, 
as measured by the Tests for Primary Mental Abilities of L. L. Thur- 
stone, and preferences for certain types of activities, as measured by 
the Preference Record of G. Frederic Kuder. This study is confined 
to the relations of preferences to abilities and is not concerned with 
the relations of measures in either of these fields to success. Such a 
study should reveal whether it is worth while to measure both abil- 
ities and preferences, or whether one can more profitably confine his 
measurement to one field alone. Strictly speaking, the conclusion will 
apply only to the measuring instruments in question. To consider one 
of the extreme possibilities, if the correlation between a given pref- 
erence scale and a given primary composite were unity, we should, in 
the interests of economy of measurement, dispense with one of the 
two measures, since only one of the two would predict success as well 
as both together. On the other hand, if little overlapping between a 
measure of preference and a measure of ability is revealed, it is there- 
by demonstrated that the two instruments are measuring essentially 
different things. 

The experimental edition of Thurstone’s Tests for Primary Men- 
tal Abilities was given to University of Chicago freshmen in Septem- 
ber, 1938. This edition yields scores on seven primary composites, 
Perception, Number, Verbal, Space, Memory, Induction, and Reason- 
ing. The tests included in each of these composites are as follows: 


Symbol    Factor        Tests 

P         Perception    Identical Forms 
                        Verbal Enumeration 

N         Number        Addition 
                        Multiplication 

V         Verbal        Completion 
                        Same-Opposite 

S         Space         Cards 
                        Figures 

M         Memory        Initials 
                        Word-Number 

I         Induction     Letter Grouping 
                        Marks 
                        Number Patterns 

D         Reasoning     Arithmetic 
                        Mechanical Movements 
                        Number Series 
These factors may be briefly described as follows: The factor P 
“seems to be a perceptual ability which enables some people to excel 
in finding detail which is significant to them or detail which they are 
seeking.” Factor N “consists in facility with simple numerical work 
and is best represented in the tests of rapid calculation.” Factor V 
“is characterized by tests that involve the interpretation of language.” 
“The factor S is present in those tests which require the subject to 
think visually of geometrical forms and of objects in space.” The fac- 
tor M “can be tentatively named the ability to memorize.” “The pres- 
ent interpretation of the factor I is that it represents the ability to 
make inductive generalizations.” The reasoning factor “is one of sev- 
eral factors that may be involved in restrictive thinking. As a gen- 
eral description, it seems to represent facility in formal reasoning.”* 

It should be observed that the composite scores are not pure 
measures of the factors in question, since, in the estimation of each 
factor score, certain approximations have been resorted to. Conse- 
quently, we cannot expect the composite scores to be uncorrelated as 
the more rigorous estimates would tend to be. However, these scores 
may be considered to be relatively independent, the greatest overlap- 
ping involving only slightly more than one-fourth of the variance. 

The same University of Chicago freshmen filled out an experi- 
mental edition of a new Preference Record† in which the preference 
form of item is used. For each item, a subject is asked to indicate 
which one of two activities he prefers, as illustrated in the following 
examples: 


A. (1) Be the business manager of a play 
(2) Act in a play 


B. (1) Be the treasurer of a club 
(2) Be the secretary of a club 


C. (1) Work in a garden 
(2) Sketch an interesting scene 


The characteristics of this type of item have been described else- 
where.‡ 


* Thurstone, L. L. Manual of Instructions for Tests for Primary Mental Abil- 
ities. Washington: The American Council on Education, 1938. 
† Kuder, G. Frederic. Preference Record. Chicago: Univ. Chicago Bookstore, 
9. 
‡ Kuder, G. Frederic. The stability of preference items. J. soc. Psychol., 1939, 
10, 41-50. 
This Preference Record was differentially scored for eight types 
of activities, namely, scientific, computational, musical, artistic, liter- 
ary, persuasive, athletic, and social prestige. It should be noted that 
the measures obtained from the experimental edition of the Prefer- 
ence Record are not quite the same as those obtained from the present 
edition. The experimental edition was not scored for a “social serv- 
ice” scale, as is the present edition, but included athletic and social 
prestige scores which are not obtained from the current edition. 

The records of 312 men and of 200 women were studied. To al- 
low for any sex differences which might occur, results for the two 
groups were analyzed independently. For each sex and for each pref- 
erence record score, our method was as follows: (1) On the basis of 
the preference score, the upper twenty-seven per cent and the lower 
twenty-seven per cent of the group were selected; (2) for each of 
these two extreme groups, the mean score on each primary com- 
posite was determined in terms of standard scores based on the en- 
tire group. This procedure was followed for each preference vari- 
able, and the results were plotted in the form of profiles. 
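
The two-step procedure just described may be sketched as follows. This is an illustration in Python rather than the writers' own computation. It assumes that standard scores are formed with the standard deviation of the entire group taken as a population value, that the size of each extreme group is twenty-seven per cent of the group rounded to the nearest whole number, and that ties are left in their original order. None of these details is stated in the text, and the function names are hypothetical.

```python
import statistics


def standard_scores(values):
    """Express each value as a deviation from the group mean in SD units.

    ASSUMPTION: the population form of the standard deviation is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def contrasted_group_means(preference, ability, fraction=0.27):
    """Mean ability standard score for the upper and lower preference groups.

    Returns (mean for the upper group, mean for the lower group).
    """
    n = len(preference)
    k = round(n * fraction)  # ASSUMPTION: nearest whole number of cases
    order = sorted(range(n), key=lambda i: preference[i])
    z = standard_scores(ability)  # standard scores based on the entire group
    lower = [z[i] for i in order[:k]]
    upper = [z[i] for i in order[-k:]]
    return statistics.fmean(upper), statistics.fmean(lower)


if __name__ == "__main__":
    # Hypothetical scores for 100 subjects, for illustration only.
    preference = list(range(100))
    ability = list(range(100))
    high, low = contrasted_group_means(preference, ability)
    print(f"upper group mean {high:+.2f}, lower group mean {low:+.2f}")
```

Repeating the call for each primary composite gives the pair of points plotted for that composite in a profile. Under the rounding rule assumed here the extreme groups would contain 84 of the 312 men and 54 of the 200 women; the sizes actually used are not reported.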

For each sex, Pearson intercorrelation coefficients of all measures 
used were obtained. In addition to the correlations between the pref- 
erence measures and the primary composites, we thus have the corre- 
lations among the preference scores and among the ability scores. 
Since the standard errors of the correlation coefficients can be esti- 
mated readily and used to give an adequate idea of the significance of 
a relation, critical ratios for the differences between the high and low 
groups have not been computed. 
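
The writers do not say which formula for the standard error they had in mind. As a rough guide, the sketch below uses the familiar large-sample value for a correlation whose population value is zero, namely one over the square root of N minus one. That choice, and the function names, are assumptions made for illustration.

```python
import math


def standard_error_zero_r(n):
    """Large-sample standard error of r when the population correlation is zero."""
    return 1.0 / math.sqrt(n - 1)


def critical_ratio(r, n):
    """Observed correlation divided by its standard error under the null hypothesis."""
    return r / standard_error_zero_r(n)


if __name__ == "__main__":
    for label, n in (("men", 312), ("women", 200)):
        print(f"{label}: N = {n}, standard error = {standard_error_zero_r(n):.3f}")
```

On this basis the standard error is about .057 for the 312 men and about .071 for the 200 women, so that a correlation must be roughly twice these values before it would ordinarily be regarded as significant.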

The ability profiles for the contrasted groups in each of the pref- 
erence variables are presented on the following three pages. In each 
graph, the solid line connects the mean ability scores for the upper 
twenty-seven per cent in the indicated preference variable, and the 
broken line connects the mean ability scores for the lower twenty- 
seven per cent in the preference variable. It should be remembered 
that these means are expressed as deviations from the general means 
in standard deviation units. Along the base of each graph are given 
the corresponding numerical data and, for purposes of comparison, 
the correlations of the given preference variable with each primary 
composite. The correlations are based on the entire group. In gen- 
eral, the correlations are positive when the high group exceeds the 
low group and negative when the low group exceeds the high. Minor 
exceptions to this rule must be the result of departure from linearity 
in the relations of the variables involved. 

The trends will be discussed separately for each preference variable. 

1. Scientific. No striking differences between the upper and low- 
er groups are revealed for the Scientific scale for either sex. The 
largest difference in mean primary composite scores for the two 
groups is on the Reasoning composite for women, the correlation of 
Reasoning with the Scientific scale being .14. It may also be noted 
that the women high on the Scientific scale, as compared with those 
low on the scale, excel on every ability composite except Memory, on 
which the difference is reversed and negligible. 

2. Computational. For both sexes, the correlations of the Compu- 
tational scale with three primary abilities composites are positive and 
significant. The three composites are Number, Induction, and Rea- 
soning. The outstanding difference of means is that on the Number 
composite for women; the corresponding correlation, .39, is by far the 
highest of the correlations between the preference scales and ability 
composites investigated. In this case, the difference in the means for 
the two contrasted groups is approximately one standard deviation. 
For men, preference for computational activities is not so closely re- 
lated to computational facility, though here again the correlation is 
quite high. It may be pointed out that both the Induction and Rea- 
soning composites contain at least one test which is concerned with 
the manipulation of numbers in some way. The Induction composite 
includes the Number Patterns Test, while the Reasoning composite 
contains the Number Series Test and the Arithmetic Test. Hence it 
is not surprising that the relations of these two composites with the 
Computational scale should be sizeable. 

3. Musical. An interesting sex difference appears in the case of 
the Musical scale. For the women, the ability profiles for the “musi- 
cal” and “non-musical” groups are quite similar. The “musical” men, 
however, tend to exceed the “non-musical” men on five of the seven 
primary composites: Perception, Verbal, Space, Induction, and Rea- 
soning. The appearance of a sex difference for this scale may reflect 
differences in exposure to musical training. It seems likely that girls, 
who are in general expected to know something about music, are sub- 
jected to a certain amount of musical training irrespective of their 
abilities or special talents, whereas only boys who are above average 
in ability undertake sufficient training in music to enhance their in- 
terests. 

4. Artistic. It is first to be noted that the trends for the Artistic 
scale are quite similar for men and for women. The outstanding dif- 
ference of mean primary composite scores for the upper twenty-seven 
per cent as compared with the lower twenty-seven per cent is that 
for the Number composite, the low group exhibiting greater numer- 
ical facility than the high group. The four composites on which the 
“non-artistic” exceed the “artistic” each include at least one test 
which requires some sort of manipulation of numbers. This observa- 
tion is consistent with the everyday observation that persons engaged 
in artistic pursuits do not customarily think in quantitative terms. 

5. Literary. The differences in the mean composite scores for 
the contrasted groups on the Literary scale are fairly pronounced. As 
would be anticipated, the highest relation is with the Verbal com- 
posite, on which, for both men and women, the “literary” group ex- 
ceeds the “non-literary” group. The correlation of the Literary scale 
with the Verbal composite is .27 for both sexes. For women, the cor- 
relation between the Literary scale and Memory is positive and fairly 
high, while the correlation for men is slightly negative. This incon- 
sistency in the trends for the Memory composite represents the prin- 
cipal sex difference for the Literary scale. Perhaps enjoyment of the 
types of literature preferred by women is more dependent upon the 
ability to memorize than is the case for men. Or it may be that wom- 
en with a high degree of memorizing ability find it easy to excel in 
literary activities and thus tend to develop literary interests to a 
greater degree than do men with superior memories. 

6. Persuasive. The Persuasive scale is negatively related to all 
the primary abilities composites in both the male and female groups. 
The difference between the means is most striking in the case of the 
Verbal composite for men, the corresponding correlation being —.21. 
At first thought it might be expected that the Persuasive scale would 
be positively related to the Verbal factor, but it should be remem- 
bered that this Verbal factor is more concerned with verbal reason- 
ing than with verbal fluency. A measure of Thurstone’s Word factor, 
W, might have a positive correlation with the Persuasive scale. The 
nature of the items comprising the Persuasive scale also has a bear- 
ing on the sign of the correlation with the Verbal factor. The items 
are concerned largely with salesmanship and related activities, and 
we might expect persons high on the Verbal composite to prefer ac- 
tivities of a more “intellectual” nature than selling and the like. 

7. Athletic. The remarkable feature of the profiles for the Ath- 
letic scale is that the composite primary abilities scores for men low 
on this scale are in each case above those for men high on the scale. 
However, those men who are high on the Athletic scale are conspicu- 
ously below the average for the entire group on only one of the pri- 
mary composites, the Verbal composite, for which the largest differ- 
ence between the high and low groups occurs. The correlations of 
scores on the Athletic scale with the primary composite scores hover 
around zero, excepting that for the Verbal composite, which is —.26. 
The trend suggests that, among University of Chicago freshman men, 
the expression of athletic preferences may be a compensation for a 
low level of abilities, particularly for a low degree of verbal ability. 
For women, no outstanding differences occur. 

8. Social Prestige. The Social Prestige scale has the highest re- 
lation in the male group with the Number composite, the relation be- 
ing negative. For women, there is a negative relation with the Rea- 
soning composite. Neither of these findings has any ready explana- 
tion, though the correlations in question approach statistical signif- 
icance. 

The intercorrelations of all measures used in this study are pre- 
sented in the table. The first coefficient in each cell is for the 312 men; 
the second coefficient is for the 200 women. 
The correlations of the preference records and the primary com- 
posites, which have been considered in our previous discussion, ap- 
pear in the upper right-hand and lower left-hand sections. It is no- 
table that only one correlation above .30 appears, namely, one of .39 
for women between the Number composite and the Computational 
preference scale. There are three cells in which the correlations for 
both sexes are above .20; these are the correlations between the Ver- 
bal composite and the Literary preference scale, and for the Compu- 
tational preference scale with both the Induction and the Reasoning 
composites. Two other correlations exceeding .20 in absolute magni- 
tude occur for men; these are —.264 between the Verbal composite 
and the Athletic preference scale and —.213 between the Verbal com- 
posite and the Persuasive preference scale. 
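
The size of these relations may be judged by the proportion of variance two measures have in common, which for a linear relation is the square of the correlation. The short sketch below applies this to three coefficients quoted above; it is offered only as an illustration, and the function name is hypothetical.

```python
def shared_variance(r):
    """Proportion of variance common to two linearly related measures."""
    return r * r


# Coefficients quoted in the text above.
QUOTED = {
    "Number with Computational (women)": 0.39,
    "Verbal with Athletic (men)": -0.264,
    "Verbal with Persuasive (men)": -0.213,
}

if __name__ == "__main__":
    for label, r in QUOTED.items():
        print(f"{label}: r = {r:+.3f}, shared variance = {shared_variance(r):.3f}")
```

Even the largest of the coefficients, .39, corresponds to about fifteen per cent of the variance in common.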

The intercorrelations of the primary composite scores are pre- 
sented in the lower right-hand section of the table. All of these corre- 
lations are positive. For men they range from .010 to .564 and, for 
women, from .078 to .556. For both groups the smallest correlation 
is between the Perception and Memory composites, and the highest 
correlation is between the Induction and Reasoning composites. In 
general, the Memory composite has the lowest average correlation 
with the other primary composites, and the Induction and Reasoning 
composites have the highest average correlations with the other pri- 
mary composites. A fairly close agreement between the correlations 
for the two sexes is evident. 

The intercorrelations for the preference scales appear in the up- 
per left-hand section of the table. For men, they range from —.510 
to .433; for women, they range from —.379 to .325. For both groups, 
the highest negative correlation is that between the Literary scale 
and the Athletic scale; and the highest positive correlation is between 
the Social Prestige scale and the Athletic scale. Unlike the intercorre- 
lations for the abilities tests, those for the preference scales are nega- 
tive as well as positive. This result is to be expected from the nature 
of the items composing the scales. 

It is apparent that there is, on the whole, relatively slight over- 
lapping between the measures of ability and the preference measures. 
The trends of the relations which do appear, with one or two excep- 
tions, may definitely be said to be in line with our expectations. How- 
ever, the interpretation of preference scores as indicative of the pres- 
ence or absence of special abilities is unwarranted by the results of 
this investigation. It again should be noted explicitly that the rela- 
tion of either preferences or abilities to achievement or success in any 
pursuit is not investigated in this study. If it can be demonstrated 
that measures in each of these domains have prognostic value for cer- 
tain criteria of success, it would appear that a combination of the two 
sorts of measures ought to be more effective than measures in either 
of the two fields alone. 

The writers are grateful to Professor L. L. Thurstone, who sug- 
gested this study, for several critical discussions of the problem. 
AN EMPIRICAL STUDY OF THE EFFECT OF HETEROGENE- 
OUS WITHIN-GROUPS VARIANCE UPON CERTAIN 
F-TESTS OF SIGNIFICANCE IN ANALYSIS 
OF VARIANCE 


R. H. GODARD AND E. F. LINDQUIST 
THE STATE UNIVERSITY OF IOWA 


In the application of the analysis of variance to data obtained 
in educational methods experiments which involve several classes of 
several schools, one assumption is that of homogeneity in the vari- 
ances of pupil scores from school to school. It is shown that such 
variances on representative educational achievement tests are hetero- 
geneous. The effects of this heterogeneity upon the F-tests of signifi- 
cance commonly employed in methods experiments are investigated 
by comparing the actual distribution of F values for a large number 
of “experiments” involving marked heterogeneity with a theoretical 
distribution based on the assumption of homogeneity. Although the 
findings, which vary somewhat with the type of variance ratio, are 
not entirely conclusive, they apparently demonstrate that departure 
from homogeneity does not invalidate the use of the customary F- 
tests for evaluating results of the typical methods experiment. 


I. INTRODUCTION 


A large proportion of all methods experiments in education are 
of the type which involve several schools. The typical experimental 
design in such experiments is that in which the pupils within each 
school are divided into as many equal classes as there are methods to 
be evaluated, one for each method. The classes are then taught by 
the respective methods for a given period of time, and an effort is 
made to equalize or hold constant as many as possible of the extrane- 
ous factors which may influence pupil performance. At the close of 
the experiment, a single criterion test of achievement is administered 
to all pupils in all schools and classes, and the methods are then evalu- 
ated on the basis of the mean achievement of all pupils under each of 
the methods. 

The method of analysis which is appropriate with this experi- 
mental design is known as the method of analysis of variance. The 
reader who is familiar with this method will recall that this method 
involves analyzing the total variance into the components due to (1) 
differences between methods means (M), (2) differences between 
school means (S), (3) differences between class means after school 
and method differences have been eliminated (M X S), and (4) varia- 
tions in pupil scores within classes (WCl). The results of the analysis 
are then summarized in a form like the following: 


Source of variation    Degrees of freedom    Sum of squares    Variance 
From this point in the analysis, the first step would be to test 
for a significant interaction of methods and schools. If the variance 
ratio, F = M X S/WCl, is significant, this is taken as evidence that 
the relative effectiveness of the methods differs from school to school, 
or that the methods which are best in certain schools may not be the 
best in others. This F-test is based on the assumptions of random 
assignment of the pupils to the classes and of homogeneity in the 
variances of the pupil scores from school to school (with method dif- 
ferences eliminated). A significant F in this test may be due to the 
fact that certain methods are really more effective in some schools 
than in others, or it may be due to extraneous variables which have 
not been equalized or controlled from class to class within the various 
schools. If either or both of these factors are present, they alone 
might account for the differences in methods for the whole experi- 
ment. In other words, the rank order of the methods for the particu- 
lar schools involved in the experiment might not be the same as their 
order in the entire population of schools involved. If these factors 
have been randomized, that is, if the classes within each school have 
been randomly assigned to the methods, then one may test the hy- 
pothesis that the methods are equally effective for the population of 
schools by comparing the methods variance with the methods X 
schools variance by means of the F-test to determine if the variance 
ratio, F = M/M X S, is significant. This test of significance assumes 
a random sample of schools, random assignment of the methods to 
the classes within each school, and homogeneity in the variances of 
class means (with school and method differences eliminated) from 
method to method. If the test based on the M X S/WCl variance ra- 
tio shows that there is no significant interaction, then, on the hypothe- 
sis that there is no real interaction, one may test the further hy- 
pothesis that the differences in methods means are due entirely to 
the random assignment of pupils to the classes. This test assumes 
random assignment of the pupils to the classes, random assignment 
of the methods to the classes, and homogeneous within-classes vari- 
ance from school to school. This latter test (F = M/WCl) is rarely 
applied, since ordinarily one would be reluctant to assume that there 
is no real interaction, even though the observed interaction were not 
significant. 
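
The partition of the total variance and the three variance ratios discussed above may be sketched as follows. The sketch assumes one class per method in each school and the same number of pupils in every class. The scores are hypothetical and the function and variable names are illustrative only. "Variance" is used, as in the text, for a sum of squares divided by its degrees of freedom.

```python
from statistics import fmean


def analyze(scores):
    """Partition the total sum of squares for a methods-by-schools experiment.

    scores[s][m] is the list of pupil scores in school s under method m.
    ASSUMPTION: every class contains the same number of pupils.
    """
    n_s, n_m, n = len(scores), len(scores[0]), len(scores[0][0])
    grand = fmean(x for school in scores for cls in school for x in cls)
    class_mean = [[fmean(cls) for cls in school] for school in scores]
    school_mean = [fmean(row) for row in class_mean]
    method_mean = [fmean(class_mean[s][m] for s in range(n_s)) for m in range(n_m)]

    ss = {
        "M": n_s * n * sum((v - grand) ** 2 for v in method_mean),
        "S": n_m * n * sum((v - grand) ** 2 for v in school_mean),
        "MxS": n * sum(
            (class_mean[s][m] - school_mean[s] - method_mean[m] + grand) ** 2
            for s in range(n_s) for m in range(n_m)
        ),
        "WCl": sum(
            (x - class_mean[s][m]) ** 2
            for s in range(n_s) for m in range(n_m) for x in scores[s][m]
        ),
    }
    df = {
        "M": n_m - 1,
        "S": n_s - 1,
        "MxS": (n_m - 1) * (n_s - 1),
        "WCl": n_s * n_m * (n - 1),
    }
    variance = {key: ss[key] / df[key] for key in ss}
    ratios = {
        "MxS/WCl": variance["MxS"] / variance["WCl"],
        "M/MxS": variance["M"] / variance["MxS"],
        "M/WCl": variance["M"] / variance["WCl"],
    }
    return ss, df, variance, ratios


# Hypothetical scores: three schools, two methods, three pupils per class.
EXAMPLE = [
    [[10, 12, 14], [13, 15, 17]],
    [[20, 21, 25], [19, 23, 24]],
    [[8, 11, 11], [14, 15, 19]],
]

if __name__ == "__main__":
    ss, df, variance, ratios = analyze(EXAMPLE)
    for source in ("M", "S", "MxS", "WCl"):
        print(f"{source:4s} df={df[source]:2d}  SS={ss[source]:8.3f}  var={variance[source]:8.3f}")
    for name, value in ratios.items():
        print(f"F = {name}: {value:.3f}")
```

Each ratio would then be referred to the F distribution with the degrees of freedom of its numerator and denominator. The sketch stops short of that step, since it is the validity of that reference under heterogeneous variances which is in question here.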

One of the essential assumptions in this analysis is that of homo- 
geneity in the variances of pupil scores from school to school. The 
following section will show that the variances of pupil scores on edu- 
cational achievement tests are actually heterogeneous from school to 
school. It is important, then, to know just what the effect of this 
known heterogeneity is upon the F-tests of significance commonly em- 
ployed in methods experiments in educational research. The writers 
therefore created a situation (described in section III) in which this 
assumption of homogeneity was not satisfied and then determined its 
effect upon the three F-tests based on the variance ratios, M X S/WCl, 
M/M X S, and M/WCl. 


II. OBSERVED HETEROGENEITY OF VARIANCE 
AMONG SCHOOLS 


This section will show that the variances of pupil scores on rep- 
resentative educational achievement tests are actually heterogeneous 
from school to school. The data used to support this conclusion are 
the scores obtained from a large number of schools in the 1937 and 
1938 Iowa Every-Pupil Testing Programs. In these programs, care- 
fully constructed tests of educational achievement were administered 
under the same carefully controlled conditions to every pupil regu- 
larly enrolled in each grade or subject tested in the junior and senior 
high schools of the state. 

The Basic Skills tests were administered in grades six, seven, and 
eight. These tests (1) comprise a forty-eight page battery, the parts 
of which are indicated in the following outline: 


Title Total testing time 
Test A: Silent Reading Comprehension 65 minutes 
Part I: Paragraph Comprehension 
Part II: Understanding of Significant Details
Part III: Organization of Ideas 
Part IV: Grasp of Total Meaning of a Selection 


Test B: Work-Study Skills 85 minutes 
Part I: Vocabulary (9 minutes) 
Part II: Comprehension of Maps 
Part III: Reading of Graphs, Charts and Tables 
Part IV: Use of Basic References 
Part V: Use of an Index
Part VI: Use of the Dictionary 


Test C: Basic Language Skills 75 minutes 
Part I: Spelling 
Part II: Sentence Sense 
Part III: Punctuation 
Part IV: Capitalization 
Part V: Usage 


Test D: Basic Arithmetic Skills 90 minutes 
Part I: Fundamental Processes 
Part II: Verbal Problem Solving 
Part III: Ability to Verbalize 
Part IV: Recognizing and Correcting Errors 
in Written Work 


The high-school program involved the administration of a sixty- 
minute objective achievement test (2) in each of the fundamental 
high-school subjects, as follows: ninth-year algebra, plane geometry, 
general science, biology, physics, world history, United States history, 
American government, first- and second-year Latin (a test of read- 
ing comprehension only), English correctness, reading comprehension 
in literature, and contemporary affairs.

The procedure followed to determine the existence of hetero- 
geneity in the variances of pupil scores from school to school is de- 
scribed in the following paragraphs. 

From the schools that administered the 1938 Basic Skills tests in 
the seventh grade, all those that tested twenty or more pupils were 
selected. From each of these schools, an essentially random sample 
of twenty pupils was taken by selecting the first twenty pupils in an 
alphabetized list. Then, for each of these samples of twenty pupils, 
the variance of the scores on Part I of Test B, for example, was com- 
puted. These observed variances were arranged in a distribution, 
which was compared with the theoretical distribution of variances 
that would have occurred if these samples had been random samples. 


This theoretical distribution was determined from the fact that for
random samples of size n, the sampling distribution (3) of ns²/σ², in
which s² is the sample variance and σ² is the population variance, is
identical with the sampling distribution of χ² for n - 1 degrees of
freedom. Therefore, s² is distributed as (σ²/n)χ². Thus, the theoretical
distribution is based on the hypothesis of homogeneity in the variances
of pupil scores from school to school.

The limits of the variance intervals in the distribution shown in
Table I were found by first estimating the population variance, σ²,
in (σ²/n)χ². This was done by estimating the population variance from
each sample (or school) and then finding the mean of all these estimates.
The value of χ² (for n - 1 degrees of freedom) as given in the
χ² table for each percentage level was then multiplied by σ²/n to obtain
the lower limit of the corresponding interval.

Thus, in Table I, a variance value of 87.6 in the theoretical distribution
of variances corresponds to the value of χ² at the five per
cent level given in the table for χ². The number of observed variances
falling in each of these intervals was then tabulated, and these frequencies
are given in the f_o column of the table. The theoretical relative
frequencies are given in the f'_t column. These theoretical relative
frequencies were then expressed, in the f_t column, as theoretical
absolute frequencies in a distribution of 89 cases (f_t = k f'_t, k
representing the total number of samples or schools).

The goodness of fit of the observed distribution to the theoretical
distribution was then tested by the use of the χ² test. The value of χ²
for Table I was 12.236, as indicated in the table. The probability of
getting this large a value of χ² under a true hypothesis is less than
0.10. However, a more rigid test of the hypothesis of homogeneity, as
devised by Neyman and Pearson (4), was applied to the data. The normal
deviate equivalent of the value found for this “λ-test” gives a
probability less than 0.01. (This “λ-test” was applied only to those
data, other than the 1938 ninth-grade English correctness test, for
which the value of χ² was not large enough to give a probability of
less than 0.001.)


                                 TABLE I

Distribution of Variances of Scores of Seventh Grade Pupils on a Vocabulary
Test (Part I, Test B, of 1938 Iowa Every-Pupil Tests of Basic Skills) in 89
Schools (20 pupils from each school) Compared with Distribution Expected on
Hypothesis of Random Sampling

                                           Pooled   Pooled
Variances          f_o    f'_t     f_t      f_o      f_t     f_o - f_t
105.2 & above        3    .01      0.89  }
 97.9 - 105.1        2    .01      0.89  }   12      8.90    (+) 3.10
 87.6 -  97.8        5    .03      2.67  }
 79.1 -  87.5        2    .05      4.45  }
 69.5 -  79.0        8    .10      8.90                      (-) 0.90
 63.0 -  69.4       10    .10      8.90                      (+) 1.10
 53.3 -  62.9       18    .20     17.80                      (+) 0.20
 44.6 -  53.2       10    .20     17.80                      (-) 7.80
 39.9 -  44.5        4    .10      8.90                      (-) 4.90
 33.9 -  39.8       14    .10      8.90                      (+) 5.10
 29.4 -  33.8        3    .05      4.45  }
 24.9 -  29.3        6    .03      2.67  }   13      8.90    (+) 4.10
 22.2 -  24.8        0    .01      0.89  }
below 22.2           4    .01      0.89  }
                    89   1.00     89.00

χ² test of goodness of fit: χ² = 12.236
                            .05 < P < .10

Neyman-Pearson test of homogeneity of variance:
        Normal deviate equivalent of λ = 4.78
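The goodness-of-fit figure quoted for Table I can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the eight pooled classes read from Table I (the four extreme intervals at each end combined), and the function name `chi_square` is a hypothetical helper, not part of the original study.

```python
# Observed frequencies for the eight pooled classes of Table I,
# listed from the highest variance interval to the lowest.
observed = [12, 8, 10, 18, 10, 4, 14, 13]

# Theoretical relative frequencies for the same classes
# (.01 + .01 + .03 + .05 = .10 for each pooled tail).
relative = [0.10, 0.10, 0.10, 0.20, 0.20, 0.10, 0.10, 0.10]


def chi_square(observed, relative):
    """Pearson chi-square of observed counts against expected proportions."""
    total = sum(observed)
    expected = [total * p for p in relative]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2_table_1 = chi_square(observed, relative)
print(round(chi2_table_1, 3))  # 12.236
```

The same helper applies to any of the other tests in this section once the observed class frequencies are known.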

This same procedure was followed in analyzing the test scores 
from Test B (Parts II to VI), Test C, and Test D of the 1938 Basic 
Skills program. The essential results from the analysis of each of 
these tests are given as follows: 


                                       P (6 degrees      Normal deviate equivalent
                            χ²         of freedom)       of Neyman-Pearson “λ”
Test B (Parts II-VI)       6.787       .20 < P < .50              2.52
Test C                    80.126       P < .001
Test D                    28.978       P < .001


For the high-school data, the same procedure was followed ex- 
cept that a random sample of thirty pupils was drawn from each 
school. The essential results from the analysis of the 1938 tests in 
ninth- and twelfth-grade English correctness, American history, al- 
gebra, biology, and 1937 ninth-grade English correctness are given 
as follows: 


                                       P (6 degrees      Normal deviate equivalent
                            χ²         of freedom)       of Neyman-Pearson “λ”
Ninth-grade English
  Correctness              1.441       .95 < P < .98
Twelfth-grade English
  Correctness             12.998       .02 < P < .05              3.10
American History          44.080       P < .001
Algebra                   16.586       .01 < P < .02              5.52
Biology                    5.609       .20 < P < .50              1.82
Ninth-grade English
  Correctness (1937)       8.700       .10 < P < .20              2.72


These tests that were analyzed represent an adequate selection 
of the tests listed at the beginning of this section. These tests are, 
moreover, representative of the kind that are often employed as cri- 
terion measures in methods experiments. The number of schools and 
pupils is large enough to make the results reliable. It is evident that 
marked heterogeneity of the variances of pupil scores on most educational
achievement tests actually does exist. The distribution of the
variances of the 1938 ninth-grade English correctness test scores is
the only distribution consistent with the assumption of homogeneity
(χ² = 1.441, P > 0.95). However, it is likely that this is merely a
chance occurrence, because the analysis of the same test from the 1937 
program exhibited a degree of heterogeneity comparable with the 
other 1938 tests (except ninth-grade English correctness). 

Because this heterogeneity does exist, the writers conducted this 
study in order to determine if the F-tests based on the M/WCl,
M X S/WCl, and M/M X S variance ratios are significantly affected
when one of the requirements, that of homogeneous variance of pupil 
scores from school to school, is not satisfied. 


III. TESTING THE EFFECT OF HETEROGENEOUS VARIANCE


In order to obtain some quantitative description of the effect of 
heterogeneous variance within schools upon the validity of the tests 
of significance employed in methods experiments as described in sec- 
tion I, the writers determined the actual distribution of F values for 
a large number of “experiments” and compared this observed dis- 
tribution with a theoretical distribution which would be obtained if 
all the assumptions were satisfied. The procedure employed in doing 
this is stated in the following paragraphs: 

1. The data used were the scores made by seventh-grade pupils 
on Test A of the 1938 Iowa Every-Pupil Test of Basic Skills. All 
schools which had fifteen or more pupils taking the test were selected. 
This gave 151 schools. 

2. For each school, the variance of the first fifteen pupil scores 
from an alphabetized (essentially random) list of pupils was com- 
puted. Then a distribution of these 151 variances was made, which 
was compared with a theoretical distribution (described in section 
II) in order to determine the degree of heterogeneity among the vari- 
ances. Since the degree of heterogeneity that was found did not ex- 
ceed that of some of the distributions of the variances for the tests 
indicated in section II, forty-seven schools with near-average vari- 
ances were discarded. The variances of the remaining 104 schools 
still did not exhibit a sufficiently marked degree of heterogeneity for 
the purpose of this study. Therefore some of the variances were arti- 
ficially increased or decreased in order to give a degree of marked 
heterogeneity that would rarely, if ever, be found in actual situations. 
This alteration was done for some schools by merely increasing or 
decreasing some of the pupil scores (when the pupil scores were later 
assigned at random to the methods, new random numbers were given 
to these altered scores so as to avoid biased inclusion of high or low
scores in any of the methods groups). The distribution of observed
variances for these 104 schools is shown in Table II. The reader will
note that most of the differences (f_o - f_t) in this table are larger
than the differences in Table I. The value of χ² indicated in Table II
is larger than those indicated for most of the tests listed in section II.
Also, the normal deviate equivalent of the Neyman-Pearson test of
homogeneity of variance (“λ-test”) is larger than any indicated for
the tests listed in section II. Thus, if it can be shown that this unusually
large degree of heterogeneity of variance does not seriously affect
the F-tests of significance involved in this study, then it is not likely
that a lesser degree of heterogeneity, as exhibited by the actual data
presented in section II, will disturb these F-tests.


                                 TABLE II

Distribution of Variances of Scores of Seventh Grade Pupils on a Comprehensive
Test of Achievement in Silent Reading Comprehension (Test A of 1938 Iowa
Every-Pupil Tests of Basic Skills) in 104 Schools (15 pupils from each school)
Compared with Distribution Expected on Hypothesis of Random Sampling

                                           Pooled   Pooled
Variances          f_o    f'_t     f_t      f_o      f_t     f_o - f_t
328.1 & above        5    .01      1.04  }
302.6 - 328.0        2    .01      1.04  }   18     10.40    (+) 7.60
266.7 - 302.5        5    .03      3.12  }
237.2 - 266.6        6    .05      5.20  }
204.4 - 237.1       11    .10     10.40                      (+) 0.60
182.6 - 204.3        8    .10     10.40                      (-) 2.40
150.2 - 182.5       11    .20     20.80                      (-) 9.80
121.8 - 150.1       11    .20     20.80                      (-) 9.80
106.6 - 121.7        9    .10     10.40                      (-) 1.40
 87.7 - 106.5       14    .10     10.40                      (+) 3.60
 74.0 -  87.6        7    .05      5.20  }
 60.4 -  73.9        4    .03      3.12  }   22     10.40    (+) 11.60
 52.5 -  60.3        5    .01      1.04  }
below 52.5           6    .01      1.04  }
                   104   1.00    104.00

χ² test of goodness of fit: χ² = 29.749
                            P < 0.001

Neyman-Pearson test of homogeneity of variance:
        Normal deviate equivalent of λ = 7.50


3. By using random numbers, the fifteen pupil scores in each 
school were divided into three random groups of five scores each. The 
five pupil scores corresponding to the five highest random numbers 
constituted Method A of School I, the next five scores corresponding 
to the next five highest random numbers constituted Method B of
School I, and the remaining five scores made up Method C of School 
I. Each of these three groups of scores was considered, then, as com- 
posed of scores made by five pupils on a criterion test after having 
been taught by a certain method (Methods A, B, or C). (Actually 
there were no real differences in method. Since the pupils were all 
taught alike, any differences between methods means from group to 
group would be due only to chance.) Similarly, this was done for 
each of the 104 schools by punching random numbers on the pupil’s 
score card and making the random selections on the Hollerith sort- 
ing machine. 

4. In order to obtain 1000 sets of four schools each, random 
combinations of four schools were secured from the 104 schools. This 
was done by assigning random numbers on cards to each of the 104 
schools and making the random selections on the Hollerith sorting 
machine. The first run gave twenty-six sets of four schools each. Each 
run of the cards through the machine gave a different set of twenty- 
six random combinations. After each run, however, new random num- 
bers were assigned to the 104 schools, thus insuring that each set of 
four schools was an independently derived random set. This was continued
until 1000 random sets of four schools each had been secured.
The probability would be very small that any one of these sets would 
contain the same four schools as some other set. Each of these sets 
was then considered as an “experiment” to be analyzed as described 
in section I. Thus, each “experiment” consisted of four schools and 
three methods (A, B, and C), with a total of sixty pupils, twenty pu- 
pils under each method. 

5. By following the technique of analysis of variance, the vari- 
ances were computed for methods (M), methods by schools (M X S), 
and within-classes (WCl). (There are two degrees of freedom for 
methods, six for methods by schools, and forty-eight for within- 
classes.) 

6. Then the variance ratios, M X S/WCl, M/M X S, and 
M/WCl, were calculated for each “experiment.” Thus 1000 F values
were computed for each variance ratio. Now, since there are no
real methods differences, these F values would be distributed as indicated
in the table (5) for F, unless these F-tests are affected by the
heterogeneity in the variances within schools. If these F values were 
distributed as indicated in the table for F, it would be expected that 
150, or fifteen per cent, of the 1000 F values for each variance ratio 
would fall between the twenty per cent and five per cent levels of sig- 
nificance; 40, or four per cent, would fall between the five per cent 
and one per cent levels of significance; etc. The points of significance 
corresponding to these percentage levels are found in the F table.
For example, the twenty per cent point of significance for the F-test
based on the M/M X S variance ratio with two and six degrees of
freedom (Table III) is 2.13, the five per cent point is 5.14, etc. It
would be expected, then, that 150 of these 1000 F values would fall 
between 2.13 and 5.14. Thus, for each variance ratio, the interval 
limits of a theoretical distribution were found as given in the F table, 
and then the observed F values were tabulated with reference to these 
intervals. The number of observed F values within each interval was 
then compared with the expected number to determine the actual 
deviation from expectation. These comparisons are presented in 
Tables III, IV, and V. 
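Steps 3 to 6 above can be sketched in code for a single “experiment.” This is a minimal illustration, not the procedure actually run on the Hollerith equipment: the function name `analyse_experiment` is hypothetical, a pseudo-random shuffle stands in for the random-number card sort, and the four schools of normally distributed scores are invented data with deliberately unequal variances.

```python
import random


def analyse_experiment(schools, rng, n_methods=3):
    """Split each school's scores at random among the methods, then return
    degrees of freedom, mean squares and F ratios for one experiment."""
    per_cell = len(schools[0]) // n_methods

    def mean(values):
        return sum(values) / len(values)

    # cells[s][m] holds the scores of school s assigned to method m.
    cells = []
    for scores in schools:
        shuffled = list(scores)
        rng.shuffle(shuffled)  # stands in for the random-number card sort
        cells.append([shuffled[m * per_cell:(m + 1) * per_cell]
                      for m in range(n_methods)])

    n_schools = len(cells)
    grand = mean([x for school in cells for cell in school for x in cell])
    school_means = [mean([x for cell in school for x in cell])
                    for school in cells]
    method_means = [mean([x for school in cells for x in school[m]])
                    for m in range(n_methods)]

    # Sums of squares for methods, methods by schools, and within classes.
    ss_m = n_schools * per_cell * sum((mm - grand) ** 2 for mm in method_means)
    ss_ms = per_cell * sum(
        (mean(cells[s][m]) - school_means[s] - method_means[m] + grand) ** 2
        for s in range(n_schools) for m in range(n_methods))
    ss_wcl = sum((x - mean(cell)) ** 2
                 for school in cells for cell in school for x in cell)

    df = {"M": n_methods - 1,
          "MxS": (n_methods - 1) * (n_schools - 1),
          "WCl": n_schools * n_methods * (per_cell - 1)}
    ms = {"M": ss_m / df["M"],
          "MxS": ss_ms / df["MxS"],
          "WCl": ss_wcl / df["WCl"]}
    f = {"MxS/WCl": ms["MxS"] / ms["WCl"],
         "M/MxS": ms["M"] / ms["MxS"],
         "M/WCl": ms["M"] / ms["WCl"]}
    return df, ms, f


# Hypothetical data: four schools of fifteen pupils with unequal variances.
rng = random.Random(1938)
schools = [[rng.gauss(50, sd) for _ in range(15)] for sd in (5, 10, 15, 20)]
df, ms, f = analyse_experiment(schools, rng)
print(df)
print(f)
```

Repeating the call over many random sets of four schools and tabulating each F ratio against the tabled percentage points would give distributions of the kind compared in Tables III, IV, and V.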


                               TABLE III

F (M/M X S, 2 and 6 degrees of freedom)      f_o      f_t
10.92 & above                                 15       10
 5.14 - 10.91                                 49       40
 2.13 -  5.13                                151      150
below 2.13                                   785      800
                                            1000     1000

χ² = 4.813
.10 < P < .20


If all the assumptions underlying this F-test (M/M X S) were 
satisfied, twenty per cent, or not more than 200, of the 1000 F values 
would be expected to exceed 2.13. At the five per cent level, not more 
than 50 would be expected to exceed 5.14. At the one per cent level, 
10 values of F would be expected to exceed 10.92. Yet, in spite of the 
marked heterogeneity in the variances within schools, only 215 val- 
ues of F exceeded 2.13, only 64 exceeded 5.14, and only 15 exceeded 
10.92. Thus, the observed frequencies (f_o) compare very favorably
with the expected frequencies (f_t). The value of χ² in testing the
goodness of fit of the observed distribution to the theoretical distribu- 
tion for three degrees of freedom is not significant. 
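The exceedance counts and the χ² value discussed for Table III follow directly from the tabulated frequencies. The short sketch below assumes only the four interval counts of Table III; the variable names are illustrative.

```python
from itertools import accumulate

# Table III frequencies, listed from the highest F interval to the lowest.
f_observed = [15, 49, 151, 785]
f_expected = [10, 40, 150, 800]

# Running totals give the number of F values exceeding the 1, 5 and 20
# per cent points (10.92, 5.14 and 2.13 respectively).
exceed_observed = list(accumulate(f_observed))[:3]
exceed_expected = list(accumulate(f_expected))[:3]

# Goodness of fit of the observed to the expected interval counts.
chi2_table_3 = sum((o - e) ** 2 / e for o, e in zip(f_observed, f_expected))

print(exceed_observed)         # [15, 64, 215]
print(exceed_expected)         # [10, 50, 200]
print(round(chi2_table_3, 3))  # 4.813
```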


                               TABLE IV

F (M/WCl, 2 and 48 degrees of freedom)       f_o      f_t
 5.08 & above                                 27       10
 3.19 -  5.07                                 62       40
 1.67 -  3.18                                212      150
below 1.67                                   699      800
                                            1000     1000

χ² = 79.380
P < 0.001


                               TABLE V

F (M X S/WCl, 6 and 48 degrees of freedom)   f_o      f_t
 3.20 & above                                 21       10
 2.30 -  3.19                                 89       40
 1.41 -  2.29                                195      150
below 1.41                                   695      800
                                            1000     1000

χ² = 99.406
P < 0.001


The effect of heterogeneity in the variances of pupil scores from 
school to school upon the variance ratios, M/WCl (Table IV) and 
M X S/WCl (Table V), is to cause a greater divergence from expec- 
tation than is seen with the M/M X S variance ratio. With a less sig- 
nificant degree of heterogeneity in the variances, as will usually be 
found (see section II), the amount of divergence here observed is not 
likely to occur. Although the two values of χ² in Tables IV and V are
highly significant, the numerical divergence of observed F values from
the expected number in each interval is not unduly large. 

Although it is realized that the results for the M/WCl and 
M X S/WCl variance ratios are not entirely satisfactory, these re-
sults seem to justify the following conclusions: 

1. Methods differences may be evaluated by the M/M X S vari- 
ance ratio with confidence. Judging from the results of this study, 
the validity of the F-test based on this ratio is not seriously affected 
by heterogeneity in the variances of pupil scores from school to school. 

2. The use of the within-classes variance as the “error estimate” 
in evaluating the methods differences is not conclusively vindicated 
by the results of this study, yet for practical purposes in actual meth- 
ods experiments, the F-test based on the M/WCl variance ratio can
be used with some assurance. 

3. Although the results from the M X S/WCl variance ratio
showed the greatest amount of divergence, the differences were not of 
such large absolute magnitude as to invalidate completely the F-test 
based upon this ratio. The values obtained from this test of signifi- 
cance will still approximate the values in the F table. 


CONCERNING THE DETERMINATION OF
TRAIT VARIABILITY 


MALCOLM G. PRESTON 
University of Pennsylvania 


An analytic technique for the study of trait variability is pre- 
sented. An expression for the average variance from test to test 
and an expression for the variance of these variances are derived 
in terms of the number of tests and the intercorrelations between 
them, and limiting cases are examined. The question of the true re- 
lationship between the nature of the distribution of test scores in a 
sample of N persons and the nature of the distribution of n traits in 
a single individual is discussed, and other problems are introduced. 


The study of variation among traits within the individual has 
recently been summarized by Anastasi (1). Among the statistical 
methods available for attack on this problem which she cites are those 
used by Hull (2), who calculated the standard deviations of the distri- 
butions of 35 tests for each of 107 subjects, after each of the test dis- 
tributions had been reduced to standard form, and those used by 
deVoss (3), who studied distributions of trait differences to determine 
whether the differences in standard scores of test scores among gifted 
children exceeded what might be expected on the basis of chance. A 
third method suggested by Anastasi involves the inspection of matrices 
of intercorrelations in order to see whether in general high standing on 
one test presupposes high standing in others. According to Anastasi, 
“if various abilities are specific and mutually independent, so that an 
individual’s standing in one tells us nothing about his relative stand- 
ing in another, we should expect the correlation between such abilities 
to be zero or very low.” It is the object of the present investigation to 
develop a precise technique for the study of trait variability resting 
upon an analytic basis and using the coefficient of product moment 
correlation as the crucial indicator. 

For the ith individual the variance of the distribution of scores 
on n tests (the distributions being normal), when the score on the 
jth test is referred to the mean performance of N individuals on that 
test as origin and with unit deviation, is: 


$$V_i = \frac{1}{n}\left[(M_i - a_{ij})^2 + (M_i - a_{ik})^2 + (M_i - a_{ip})^2 + (M_i - a_{iq})^2 + \cdots\right], \tag{1}$$

where $V_i$ is the variance for the ith person, $M_i$ is the mean of the n
tests for the ith person, and $a_{ij}$, $a_{ik}$, $a_{ip}$, $a_{iq}$ are test scores made
by the ith person on the jth, kth, pth, and qth tests.

Summing (1) for n tests in the case of the ith person, we have

$$V_i = \frac{1}{n}\left[a_{ij}^2 + a_{ik}^2 + a_{ip}^2 + a_{iq}^2 + \cdots - nM_i^2\right]. \tag{2}$$

Over a range of N individuals the mean of the distribution of
$V_i$ will be

$$\bar{V} = \frac{1}{Nn}\sum_i \left(a_{ij}^2 + a_{ik}^2 + a_{ip}^2 + a_{iq}^2 + \cdots - nM_i^2\right), \tag{3}$$

in which $\bar{V}$ is the mean of the distribution of V. Remembering that
quantities of the order $\frac{1}{N}\sum_i a_{ij}^2 = 1$, (3) reduces to

$$\bar{V} = 1 - \frac{1}{N}\sum_i M_i^2. \tag{4}$$
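To evaluate $\sum_i M_i^2$ we note that

$$M_i = \frac{a_{ij} + a_{ik} + a_{ip} + a_{iq} + \cdots}{n}, \tag{5}$$

from which:

$$M_i^2 = \frac{1}{n^2}\left[a_{ij}^2 + a_{ik}^2 + a_{ip}^2 + a_{iq}^2 + \cdots + 2a_{ij}a_{ik} + 2a_{ij}a_{ip} + 2a_{ij}a_{iq} + 2a_{ik}a_{ip} + 2a_{ik}a_{iq} + 2a_{ip}a_{iq} + \cdots\right]. \tag{6}$$

But quantities of the order $\frac{2}{Nn^2}\sum_i a_{ij}a_{ik}$ give $\frac{2r_{jk}}{n^2}$, since by
definition $a_{ij}$ and $a_{ik}$ are specified with mean at zero and with unit
deviation. Hence from (4) we have

$$\bar{V} = 1 - \frac{1}{n} - \frac{1}{n^2}\left[\sum_{j \neq k} r_{jk}\right], \tag{7}$$

in which each of the intercorrelations $r_{jk}$ (excepting those of the form
$r_{jj}$, which do not appear) appears twice.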
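Expression (7) needs only the off-diagonal sum of the correlation matrix. The following sketch shows one way to evaluate it; the function name `mean_trait_variance` and the representation of the matrix as a list of lists are assumptions made for illustration.

```python
def mean_trait_variance(r):
    """Mean of the within-person trait variances from expression (7).

    r is a full symmetric correlation matrix (list of lists) for n tests,
    so each off-diagonal intercorrelation enters the sum twice.
    """
    n = len(r)
    off_diagonal_sum = sum(r[j][k] for j in range(n) for k in range(n) if j != k)
    return 1 - 1 / n - off_diagonal_sum / n ** 2


# Two limiting cases for n = 4 tests.
all_unity = [[1.0] * 4 for _ in range(4)]
all_zero = [[1.0 if j == k else 0.0 for k in range(4)] for j in range(4)]
print(mean_trait_variance(all_unity))  # 0.0
print(mean_trait_variance(all_zero))   # 0.75
```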

An examination of expression (7) suggests the conclusion that the
variability from trait to trait in the ith person will be, on the average,
zero when all the intercorrelations among the n tests are unity. For
n tests there will be n(n - 1) such intercorrelations. If all are equal
to unity, we have $1 - \frac{1}{n} - \frac{1}{n^2}[n(n - 1)]$, which is obviously zero.
On the other hand, if the n(n - 1) intercorrelations are each zero,
(7) obviously reduces to 1 - 1/n which, as n becomes indefinitely
large, approaches 1 as a limit. Since $a_{ij}$ is defined with unit deviation
we can conclude in the case under examination that the square root
of the variance of the person with average dispersion around the mean
of his traits will approach the standard deviation of the distribution
of the test scores as a limit as the number of tests increases. To consider
the case of negative coefficients, it must be borne in mind that
the minimum average value of a symmetrical matrix of coefficients of
correlation has been shown by Thomson (4) to be $-\frac{1}{n - 1}$, where n is
the number of tests used. Taking the limit, then we have $\sum_{j \neq k} r_{jk} = 0$,
from which in this case $\bar{V}$ yields, at the limit, 1.
The variance of the distribution of variances is given as

$$V_v = \frac{1}{N}\sum_i \left(V_i - \bar{V}\right)^2, \tag{8}$$

from which we have by expansion and simplification

$$V_v = \frac{1}{N}\sum_i V_i^2 - \bar{V}^2. \tag{9}$$

To evaluate this quantity, we note from (2) that

$$\frac{1}{N}\sum_i V_i^2 = \frac{1}{Nn^2}\sum_i \left(a_{ij}^2 + a_{ik}^2 + a_{ip}^2 + a_{iq}^2 + \cdots - nM_i^2\right)^2, \tag{10}$$

which, upon expansion and substitution of $(a_{ij} + a_{ik} + a_{ip} + a_{iq} + \cdots)/n$
for $M_i$, becomes a polynomial in $\frac{1}{8}[n^4 + 2n^3 + 3n^2 + 2n]$ terms, of which
n are of the form $\sum a_{ij}^4$ with coefficient $(n - 1)^2/n^4$; $(n^2 - n)$ are of
the form $\sum a_{ij}^3 a_{ik}$, with coefficient $-\frac{4}{n^4}[n - 1]$; $(n^2 - n)$ are of the
form $\sum a_{ij}^2 a_{ik}^2$, of which half have coefficients $2(n - 1)^2/n^4$, the other
half exhibiting $4/n^4$; $(n^3 - 3n^2 + 2n)$ are of the form $\sum a_{ij}^2 a_{ik} a_{ip}$, half
having coefficient $-4(n - 1)/n^4$, the remainder showing $8/n^4$; and
$\frac{1}{8}[n^4 - 6n^3 + 11n^2 - 6n]$ are of the form $\sum a_{ij} a_{ik} a_{ip} a_{iq}$, each having
the coefficient $8/n^4$, the entire polynomial being subject to multiplication
by 1/N.
We may take the expression $\frac{1}{N}\sum_i a_{ij} a_{ik} a_{ip} a_{iq}$ as the general
expression, noting that in view of the biquadratic character of the
polynomial, no product moments of degree higher than four will be
encountered. This expression implies the evaluation of the multiple
integral giving the normal correlation surface in four variables.
The evaluation of expressions of this kind is reported by Isserlis
(5, 6) among others. For $\frac{1}{N}\sum_i a_{ij} a_{ik} a_{ip} a_{iq}$, Isserlis gives
$r_{jk}r_{pq} + r_{jp}r_{kq} + r_{jq}r_{kp}$. Since any two or more variables may be made
identical, a variety of results may be shown, a fact which leads to the
following statement of relations (translated into the notation of this paper
from a table originally given by Isserlis):

$$
\begin{aligned}
\frac{1}{N}\sum_i a_{ij}^4 &= 3,\\
\frac{1}{N}\sum_i a_{ij}^3 a_{ik} &= 3r_{jk},\\
\frac{1}{N}\sum_i a_{ij}^2 a_{ik}^2 &= 1 + 2r_{jk}^2,\\
\frac{1}{N}\sum_i a_{ij}^2 a_{ik} a_{ip} &= r_{kp} + 2r_{jk}r_{jp},\\
\frac{1}{N}\sum_i a_{ij} a_{ik} a_{ip} a_{iq} &= r_{jk}r_{pq} + r_{jp}r_{kq} + r_{jq}r_{kp}.
\end{aligned}
\tag{11}
$$
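The relations in (11) are all special cases of the general four-variable result, obtained by letting subscripts coincide and using the unit diagonal of the correlation matrix. A small sketch makes this concrete; the function name `fourth_moment` and the three-test matrix are hypothetical and chosen only for illustration.

```python
def fourth_moment(r, j, k, p, q):
    """Fourth-degree product moment of standardized, jointly normal scores.

    r is a correlation matrix with unit diagonal; subscripts may repeat,
    which reproduces the special cases listed in (11).
    """
    return r[j][k] * r[p][q] + r[j][p] * r[k][q] + r[j][q] * r[k][p]


# Hypothetical intercorrelations among three tests.
r = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.3],
     [0.2, 0.3, 1.0]]

print(fourth_moment(r, 0, 0, 0, 0))  # all subscripts alike: 3
print(fourth_moment(r, 0, 0, 0, 1))  # 3 * r_jk
print(fourth_moment(r, 0, 0, 1, 1))  # 1 + 2 * r_jk ** 2
print(fourth_moment(r, 0, 0, 1, 2))  # r_kp + 2 * r_jk * r_jp
```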


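Substituting the appropriate values from (11) into the fourth-degree
product moments of the expansion of $\frac{1}{N}\sum_i V_i^2$, after troublesome
algebra we have:

$$
\begin{aligned}
\frac{1}{N}\sum_i V_i^2 ={}& \frac{n^2 - 1}{n^2} - \frac{2(n + 1)}{n^3}\sum_{j \neq k} r_{jk} + \frac{2}{n^4}\left[n^2 - 2n + 3\right]\sum_{j \neq k} r_{jk}^2\\
&- \frac{8}{n^4}(n - 3)\left[r_{jk}r_{jp} + r_{jk}r_{jq} + \cdots\right] + \frac{24}{n^4}\left[r_{jk}r_{pq} + r_{jp}r_{kq} + r_{jq}r_{kp} + \cdots\right].
\end{aligned}
\tag{12}
$$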
Returning to (9), we have from (7)

$$\bar{V}^2 = \left(1 - \frac{1}{n} - \frac{1}{n^2}\sum_{j \neq k} r_{jk}\right)^2. \tag{13}$$

Upon expansion and collection of terms this expression becomes:

$$
\begin{aligned}
\bar{V}^2 ={}& \frac{(n - 1)^2}{n^2} - \frac{2[n - 1]}{n^3}\sum_{j \neq k} r_{jk} + (2/n^4)\sum_{j \neq k} r_{jk}^2\\
&+ (8/n^4)\left[r_{jk}r_{jp} + r_{jk}r_{jq} + \cdots\right]\\
&+ (8/n^4)\left[r_{jk}r_{pq} + r_{jp}r_{kq} + r_{jq}r_{kp} + \cdots\right].
\end{aligned}
\tag{14}
$$
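Taking the difference between (12) and (14), we have:

$$
\begin{aligned}
V_v ={}& \frac{2n - 2}{n^2} - \frac{4}{n^3}\sum_{j \neq k} r_{jk} + \frac{2}{n^4}\left(n^2 - 2n + 2\right)\sum_{j \neq k} r_{jk}^2 - \frac{8(n - 2)}{n^4}\\
&\times \left(r_{jk}r_{jp} + r_{jk}r_{jq} + \cdots\right) + 16/n^4\left(r_{jk}r_{pq} + r_{jp}r_{kq} + r_{jq}r_{kp} + \cdots\right).
\end{aligned}
\tag{15}
$$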

As was the case with $\bar{V}$, $V_v$ may be examined for instances where
in the first place all the r's are +1, in the second place 0. If all r's are
+1, then $V_v$ gives zero by substitution of +1 for each value of r
appearing in (15) (remembering that $\sum r_{jk}$ includes n(n - 1) terms,
$\sum r_{jk}^2$ includes n(n - 1) terms, $(r_{jk}r_{jp} + r_{jk}r_{jq} + \cdots)$ includes
n(n - 1)(n - 2)/2 terms and $(r_{jk}r_{pq} + r_{jp}r_{kq} + r_{jq}r_{kp} + \cdots)$ includes
$(n^4 - 6n^3 + 11n^2 - 6n)/8$ terms). If all r's are zero then $V_v$ obviously
reduces to $(2n - 2)/n^2$, i.e., the standard deviation of the distribution
of V approaches zero with increase in the number of tests used.
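A numerical check of the limiting cases is straightforward. The sketch below is an illustration under stated assumptions rather than the author's own computation: it takes formula (15) with the first two sums running over all n(n - 1) off-diagonal elements and the two product sums counted without duplication, and the function name `variance_of_trait_variances` is hypothetical. The comparison with equal intercorrelations is an added check not made in the paper; it relies on the known result that, for normal scores with a common correlation ρ, the within-person sum of squares is (1 - ρ) times a χ² variate with n - 1 degrees of freedom.

```python
from itertools import combinations


def variance_of_trait_variances(r):
    """Variance of the within-person trait variances, following (15).

    r is a full symmetric correlation matrix (list of lists) for n tests.
    """
    n = len(r)
    idx = range(n)
    # Sums over all off-diagonal elements (each coefficient appears twice).
    s1 = sum(r[j][k] for j in idx for k in idx if j != k)
    s2 = sum(r[j][k] ** 2 for j in idx for k in idx if j != k)
    # Products of two coefficients sharing exactly one subscript,
    # each product counted once: n(n-1)(n-2)/2 terms.
    t3 = sum(r[j][k] * r[j][p]
             for j in idx
             for k, p in combinations([i for i in idx if i != j], 2))
    # Products with four distinct subscripts: three pairings per set of four.
    t4 = sum(r[j][k] * r[p][q] + r[j][p] * r[k][q] + r[j][q] * r[k][p]
             for j, k, p, q in combinations(idx, 4))
    return ((2 * n - 2) / n ** 2
            - 4 * s1 / n ** 3
            + 2 * (n * n - 2 * n + 2) * s2 / n ** 4
            - 8 * (n - 2) * t3 / n ** 4
            + 16 * t4 / n ** 4)


def equal_matrix(n, rho):
    """Correlation matrix with every intercorrelation equal to rho."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]


n = 5
print(variance_of_trait_variances(equal_matrix(n, 1.0)))  # all r's +1
print(variance_of_trait_variances(equal_matrix(n, 0.0)))  # (2n - 2) / n**2
print(variance_of_trait_variances(equal_matrix(n, 0.3)))  # 2(n-1)(1-rho)**2 / n**2
```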

The use of formula (15) in particular is predicated upon the fact 
that each of the distributions of test scores is normal in form. This 
assumption is forced upon the formula by the assumptions underlying 
the evaluation of $\sum_i a_{ij} a_{ik} a_{ip} a_{iq}$. The determination of $\bar{V}$ is of course
quite simple, involving only the addition of the columns of a correla- 
tional matrix (excluding elements on the principal diagonal and ob- 
serving the signs) and treating with the reciprocal and the reciprocal 
squared of the number of tests. In the case of formula (15), the values 
needed are (a) the number of tests, (b) the sum of the elements of 
a correlational matrix (not including those on the principal diagonal),
(c) the sum of the squares of the same elements, (d) the sum of the 
products of all pairs of coefficients of correlation (excluding those on 
the principal diagonal) selected such that two subscripts are alike, 
the remaining two unlike, and (e) the sum of the products of all pairs 
selected such that the four subscripts are all different. In both of the 
latter instances it is important that the individual coefficients not be
duplicated (as they are in the case of the sum of the first and second
powers of r).

Apart from the fact that formulas (7) and (15) imply a con- 
siderable saving of time as against the method used by Hull, for ex- 
ample, once a matrix of correlations is available, the formulas have 
additional significance. 

As was shown in connection with (7), if all r's are zero, then the
variance of the person represented by the mean of the variances is 
exactly equal to 1, provided n be sufficiently large. Since the metric 
of the experimental performances is universal for all performances 
(all being defined with mean at zero and with unit deviation) we may 
conclude that in the case of zero correlation the central tendency 
among N individuals in respect of trait variability will be the vari- 
ability of the population of N people. Those with trait variability less 
than 1 will of course exhibit less variability than does the population, 
those with trait variability in excess of one will exhibit more vari- 
ability than does the population. But the case where the r's are zero
is the case exploited by factor theory, in which the test scores are 
contemplated as transformed to orthogonal coordinates with new 
values giving rise to zero intercorrelations. Factor theory accepted 
(no matter what its variety so long as it contemplates an orthogonal 
transformation), the measure of trait variability is directly dependent 
upon the definition of the traits. On the other hand, if traits are de- 
fined on the basis of positive intercorrelations, a definite tendency 
exists on the average for the trait variability to be less than the popu- 
lation variability. It can be seen then that it is logically impossible 
(and hence practically impossible) to give a unique answer to the 
question of how variable the individual is within himself unless the 
concept of a trait be given definition. Once the definition is given, if it 
depend upon the nature of the intercorrelations existing between test 
scores, then at least two parameters of the distribution of variances 
within the individual are immediately determined. 

This situation raises the very interesting question as to the true 
relationship between the nature of the distribution of test scores in a 
sample of N individuals and the nature of the distribution of n traits 
in a single individual. We have deduced that certain relationships 
exist in the distribution of traits by the assumption of a normal 
universe of test scores represented by the sample of N people. We 
might well have started with the assumption of a normal universe of 
traits in one individual, defined a person as a set of scores giving 
zero correlation with another set of the same number and hence established
deductively important conditions affecting the distributions
of N test scores. This consideration leads to the conclusion that noth- 
ing within the formulation of the two deductive attacks permits us to 
conclude which of the two distributions we may regard as resting on
an empirical basis and which on a deductive basis. Unquestionably 
the settling of this issue would represent an important contribution to 
the satisfaction of the widely-recognized need for more satisfactory 
definitions in the psychology of ability and temperament. Without a 
solution at hand, however, it is evident from the preceding argument 
that if a psychological trait be defined on the basis of correlations be- 
tween test scores, then the mean trait variance and the variance of the 
variances are fixed by the definition. Since they have no mathemati- 
cal freedom they are scarcely a problem suitable for experimental in- 
vestigation. 

The form of the distribution of V; is of course an important 
question to be solved. We can by no means assume it to be normal 
although later investigation may indeed show it to be related to the 
normal form. In the absence of explicit knowledge as to its precise 
form, we are not completely helpless, however, in attempting to say 
how many of a given distribution of people exhibit trait variability to 
an extent less than that exhibited by the population. The Tchebycheff 
Criterion, as well as closer inequalities enumerated by Rietz (7), comes 
to mind immediately as available for the determination of this ques- 
tion. An additional question concerning the form of the distribution 
left untouched by this paper is the question as to the form of the dis-
tribution of which $V_i$ is the variance. Hull (2) was of the opinion that
the distribution was normal, on the basis of a direct examination of 
the test scores. It is the opinion of the present writer that the question 
of the normality of the trait distribution may be mathematically re- 
lated to the question of the distribution of the test scores in the popu- 
lation, and may therefore be not an experimental question at all but 
again a question which may be determined once the trait is defined.


A METHOD FOR APPROXIMATING THE AVERAGE INTERCORRELATION
COEFFICIENT BY CORRELATING THE PARTS WITH THE SUM OF THE PARTS


MILTON BABITZ AND NOEL KEYS 
University of California 


It is noted that the average inter-item correlation, which rep- 
resents the internal consistency of a test, yields a unique estimate of 
test reliability. A close approximation to this average is given by 
a formula which requires the correlation of each item with the total 
score and the standard deviation of each item. The formula is es- 
pecially useful in those instances where the number of items is small 
and where the variation in item sigmas should not be neglected. 


When a test consists of a small number of items, to estimate its 
reliability by correlating split-halves and using the Spearman-Brown 
formula is crude and undependable. The situation is slightly improved 
when the items are so divided as to give approximately equal means 
and standard deviations for the two halves. Even this method is 
questionable, however, since different item groupings will cause con- 
siderable variation in the reliabilities obtained. A third procedure 
would be to correlate each test item with every other item and calcu- 
late the average of these intercorrelations. This value would represent 
the internal consistency of the test and would yield a unique estimate 
of test reliability since it would be entirely unaffected by variations in 
grouping. The practical difficulty in the way of this method is the 
great number of correlations which it occasions. For example, a seven- 
item test would require twenty-one intercorrelations, a ten-item test 
forty-five, and so on. This is unwarrantably laborious where the pur- 
pose is merely to estimate the reliability of the test. 
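
The growth in labor can be made concrete with a short count. The sketch below is only an illustration (the function name `pairwise_correlations` is ours, not part of the original paper); it compares the n(n-1)/2 inter-item correlations with the n item-total correlations that the proposed method needs.

```python
from math import comb


def pairwise_correlations(n_items: int) -> int:
    """Number of distinct item pairs, n(n-1)/2."""
    return comb(n_items, 2)


for n_items in (7, 10, 20):
    # Inter-item correlations grow quadratically; item-total ones only linearly.
    print(n_items, "items:", pairwise_correlations(n_items),
          "inter-item correlations vs", n_items, "item-total correlations")
```

For seven and ten items the count reproduces the twenty-one and forty-five intercorrelations mentioned in the text.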

A new formula derived below represents an attempt to arrive at 
a close approximation to this same average inter-item correlation 
much more simply, by correlating the scores made on each test item 
with the total test score. This method is to be preferred to that of 
Edgerton and Toops* whenever it is desired to obtain an index of 
discrimination for the separate test items, using the total score as 
criterion. 


* Edgerton, H. A., and Toops, H. A. A formula for finding the average inter-
correlation coefficient for unranked raw scores without solving any of the individ-
ual intercorrelations. J. Educ. Psychol., 1928, 19, 131-138.


DERIVATION OF THE FORMULA


In the following derivation, the small letters $a, b, \cdots, n$ represent
the scores on items $A, B, \cdots, N$ as measured from the means $M_a$, $M_b$,
$\cdots$, $M_n$; $s$ has a similar value for the total score; $\sigma$ with appropriate
subscripts represents the standard deviation of a particular distribu-
tion; and $r$, the Pearson product-moment coefficient of correlation.

Multiplying by $\sigma_s$ and transposing the terms $\sigma_a, \sigma_b, \cdots, \sigma_n$,

Rearranging the right side of the equation,

$$
\begin{aligned}
\sigma_s (r_{as} + r_{bs} + \cdots + r_{ns}) - (\sigma_a + \sigma_b + \cdots + \sigma_n)
 &= \sigma_a (r_{ab} + \cdots + r_{an}) + \sigma_b (r_{ab} + \cdots + r_{bn}) + \cdots \\
 &\quad + \sigma_n (r_{an} + r_{bn} + \cdots) .
\end{aligned}
\tag{9}
$$


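Multiplying the numerator and denominator of the right side by
$(\sigma_a \cdot \sigma_b \cdots \sigma_n)$, it becomes

$$
\sigma_a \sigma_b \cdots \sigma_n
\left[
\frac{r_{ab} + \cdots + r_{an}}{1 \cdot \sigma_b \cdots \sigma_n}
+ \frac{r_{ab} + \cdots + r_{bn}}{\sigma_a \cdot 1 \cdots \sigma_n}
+ \cdots
+ \frac{r_{an} + r_{bn} + \cdots}{\sigma_a \cdot \sigma_b \cdots \sigma_{n-1} \cdot 1}
\right] .
\tag{10}
$$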


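If we use the arithmetic mean of the products $(1 \cdot \sigma_b \cdots \sigma_n)$,
$(\sigma_a \cdot 1 \cdots \sigma_n)$, $\cdots$, and $(\sigma_a \cdot \sigma_b \cdots \sigma_{n-1} \cdot 1)$ in place of the individ-
ual products themselves, and call the average product $P_{n-1}$, the
product $(\sigma_a \cdot \sigma_b \cdots \sigma_n)$ being designated $P_n$, then

$$
\sigma_s (r_{as} + r_{bs} + \cdots + r_{ns}) - (\sigma_a + \sigma_b + \cdots + \sigma_n)
\approx \frac{P_n}{P_{n-1}} \left[ 2r_{ab} + 2r_{ac} + \cdots + 2r_{(n-1)n} \right] ,
\tag{11}
$$

or

$$
\frac{P_{n-1}}{2P_n} \left[ \sigma_s (r_{as} + r_{bs} + \cdots + r_{ns}) - (\sigma_a + \sigma_b + \cdots + \sigma_n) \right]
\approx \left[ r_{ab} + r_{ac} + \cdots + r_{an} + \cdots + r_{(n-1)n} \right] .
\tag{12}
$$
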
Since the right-hand side represents the sum of the inter-item correla-
tions, if we divide both sides by $C_2^n$ (the number of combinations of
$n$ things taken two at a time, or $\frac{n(n-1)}{2}$) then the right side be-
comes the average inter-item correlation. Therefore the average inter-
item correlation, or $\bar r_{ij}$, approximates

$$
\frac{P_{n-1}}{2\, C_2^n\, P_n}
\left[ \sigma_s (r_{as} + r_{bs} + \cdots + r_{ns}) - (\sigma_a + \sigma_b + \cdots + \sigma_n) \right] .
\tag{13}
$$


Or, substituting for $C_2^n$, $\bar r_{ij}$ becomes approximately

$$
\frac{P_{n-1}}{n(n-1)P_n}
\left[ \sigma_s (r_{as} + r_{bs} + \cdots + r_{ns}) - (\sigma_a + \sigma_b + \cdots + \sigma_n) \right] .
\tag{14}
$$


EMPIRICAL VALIDATION


The use of the formula may be illustrated by applying it to data
obtained from tests given to sixty subjects in a recent study.* Since
this test consisted of seven items, $n = 7$. The calculations are sum-
marized in the following table:


Test item      σ      r (item)(sum)
    A        3.62         .68
    B        2.74         .40
    C        3.09         .80
    D        1.93         .34
    E        2.24         .42
    F        3.37         .67
    G        1.50         .59


For the sum of the item scores, $\sigma_s = 10.00$.

$$P_n = (3.62)(2.74)(3.09)(1.93)(2.24)(3.37)(1.50) = 669.80 .$$

To obtain the value $P_{n-1}$, the following products must be found and
averaged (or it is simpler to perform the indicated divisions):

$$
\begin{aligned}
1.\quad \sigma_a \sigma_b \sigma_c \sigma_d \sigma_e \sigma_f &= P_n / \sigma_g = 446.53 \\
2.\quad \sigma_a \sigma_b \sigma_c \sigma_d \sigma_e \sigma_g &= P_n / \sigma_f = 198.75 \\
3.\quad \sigma_a \sigma_b \sigma_c \sigma_d \sigma_f \sigma_g &= P_n / \sigma_e = 299.02 \\
4.\quad \sigma_a \sigma_b \sigma_c \sigma_e \sigma_f \sigma_g &= P_n / \sigma_d = 347.05 \\
5.\quad \sigma_a \sigma_b \sigma_d \sigma_e \sigma_f \sigma_g &= P_n / \sigma_c = 216.76 \\
6.\quad \sigma_a \sigma_c \sigma_d \sigma_e \sigma_f \sigma_g &= P_n / \sigma_b = 244.45 \\
7.\quad \sigma_b \sigma_c \sigma_d \sigma_e \sigma_f \sigma_g &= P_n / \sigma_a = 185.03 \\
\text{Sum} &= 1937.59
\end{aligned}
$$

$$P_{n-1} = \frac{1937.59}{7} = 276.80 .$$

Substituting the values in the formula, the expression becomes:

$$
\begin{aligned}
\text{Average item correlation}
 &= \frac{276.80}{(669.80)(42)} \left[ 10(.68 + .40 + .80 + .34 + .42 + .67 + .59) - 18.49 \right] \\
 &= \frac{276.80}{(669.80)(42)} \left[ 20.51 \right] = .202 .
\end{aligned}
$$

When the twenty-one intercorrelations were computed, the aver-
age correlation coefficient was found to be .201.


* Babitz, Milton. Measuring student ability in the application of scientific
principles. Unpublished seminar study, University of California, 1939.
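
The arithmetic of this example can be retraced in a few lines. The following is a minimal sketch, not the authors' own procedure: the function name `approx_mean_intercorrelation` is hypothetical, the item-total correlation for item D is taken as .34 as in the substitution step, and the mean product of n-1 sigmas is obtained by the divisions the text recommends.

```python
from math import prod


def approx_mean_intercorrelation(sigmas, item_total_r, sigma_total):
    """Formula (14): approximate average inter-item correlation."""
    n = len(sigmas)
    p_n = prod(sigmas)                              # product of all n item sigmas
    p_n_minus_1 = sum(p_n / s for s in sigmas) / n  # mean product of n-1 sigmas
    bracket = sigma_total * sum(item_total_r) - sum(sigmas)
    return p_n_minus_1 * bracket / (n * (n - 1) * p_n)


sigmas = [3.62, 2.74, 3.09, 1.93, 2.24, 3.37, 1.50]        # items A to G
item_total_r = [0.68, 0.40, 0.80, 0.34, 0.42, 0.67, 0.59]  # r of each item with the sum
r_bar = approx_mean_intercorrelation(sigmas, item_total_r, sigma_total=10.00)
print(round(r_bar, 3))
```

When all item sigmas are equal the averaging step in (11) changes nothing, so in that special case the function returns the exact mean intercorrelation; the approximation only enters when the sigmas differ.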





DERIVATION OF CORRECTION FACTOR 


A correction for the approximation made in step (11) may be
calculated as follows:

Taking the ratio of the true to the approximate values (i.e., (6) x


after factoring and cancelling the sigma terms from the denominators,
the ratio becomes

If $a, b, c, \cdots, n$ measure the same function, we may assume that any
one sum of $r$'s shown is roughly equivalent to any other such sum.
This assumption seems logical as applying to data from homogeneous
test materials of the type employed in the study cited here.* In that
case (17) becomes

the correction factor. Applying this correction factor to the equation
for $\bar r_{ij}$,

$$
\bar r_{ij} = \frac{P_{n-1}}{(R)\, n(n-1) P_n}
\left[ \sigma_s (r_{as} + r_{bs} + \cdots + r_{ns}) - (\sigma_a + \sigma_b + \cdots + \sigma_n) \right] .
\tag{20}
$$

Hence the percentage error in the uncorrected $\bar r_{ij}$ is

$$(1 - R) \times 100 . \tag{21}$$


By substituting the average inter-item correlation or reliability
in the Spearman-Brown prophecy formula, letting n equal the number
of items, the reliability of the test is approximated. An improved
value can be obtained through the use of the Kuder-Richardson†
formula which takes account of the variation in item sigmas, eliminat-
ing an error which the Spearman-Brown formula introduces.
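
As an illustration of this last step, the sketch below applies the standard Spearman-Brown prophecy formula, n r / (1 + (n - 1) r), to the average item correlation of .202 found above for the seven-item test. The function name is ours, and the paper itself does not report the resulting reliability, so the value printed is only what the formula gives under these inputs.

```python
def spearman_brown(mean_item_r: float, n_items: int) -> float:
    """Reliability of an n-item test from the average inter-item correlation."""
    return n_items * mean_item_r / (1 + (n_items - 1) * mean_item_r)


# Seven items with an average inter-item correlation of .202.
print(round(spearman_brown(0.202, 7), 3))
```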


* Babitz, Milton. Measuring student ability in the application of scientific 
principles. Unpublished seminar study, University of California, 1939. 

† Kuder, G. F. and Richardson, M. W. The theory of the estimation of test
reliability. Psychometrika, 1937, 2, 151-160.


A MATRIX MULTIPLIER


LEDYARD R. TUCKER 
UNIVERSITY OF CHICAGO 


A machine to expedite matrix multiplication has been developed 
by modifying the International Business Machines Corporation scor- 
ing machine. The principles and operation of the machine are de- 
scribed, and time and accuracy estimates are indicated. 


With the growing application of matrix algebra to statistical and 
factorial analyses, there is an increasing demand for a machine that 
will multiply matrices. This problem was submitted to the Inter- 
national Business Machines Corporation, and after several confer- 
ences with their engineers an application of the scoring machine was 
worked out. This application involved the addition of several new 
parts to the machine. Such a machine has now been built and is in 
use in the laboratory of Dr. L. L. Thurstone. 

The problem in matrix multiplication is the obtaining of a sum 
of products. This may be illustrated by the following equations in 
which the a’s and b’s are known and the c’s are to be calculated: 


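$$
\begin{aligned}
a_{11} b_1 + a_{12} b_2 + a_{13} b_3 &= c_1 , \\
a_{21} b_1 + a_{22} b_2 + a_{23} b_3 &= c_2 , \\
a_{31} b_1 + a_{32} b_2 + a_{33} b_3 &= c_3 , \\
a_{41} b_1 + a_{42} b_2 + a_{43} b_3 &= c_4 ,
\end{aligned}
\tag{1}
$$

which in matrix form are:

$$
\begin{Vmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33} \\
a_{41} & a_{42} & a_{43}
\end{Vmatrix}
\cdot
\begin{Vmatrix} b_1 \\ b_2 \\ b_3 \end{Vmatrix}
=
\begin{Vmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{Vmatrix}
\tag{2}
$$

$$A \cdot B = C$$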


Table 1 presents a numerical example of matrix multiplication.

                          TABLE 1

         1      2      3                1                1
   1    .54   -.37    .26         1    .76         1    .63
   2    .62    .40   -.18         2   -.25         2    .26
   3    .57    .12    .38         3    .60         3    .60
   4     …      …      …                           4   -.34

               A              ·         B     =         C


There is a c for each row of a’s and a b for each column. The number 
of rows and columns of a’s may be greater than shown in this illustra- 
tion. 
Often the number of rows is much greater than the number of 
columns, as in factor analysis where the a’s may be the loadings of 
many tests on several orthogonal reference axes, or in multiple re- 
gression prediction where the a’s may be the scores of many indi- 
viduals on several independent variables. In factor analysis the load- 
ings on the rotated axes are desired. A column of b’s then gives the 
direction cosines of one of the new axes and the c’s are the loadings 
on this axis. There is a column of b’s and one of c’s for each rotated 
axis. In multiple regression prediction, the b’s are the beta weights 
to be applied to the independent variables in obtaining the predicted 
scores of the dependent variable. The predicted scores are the c’s. 
Again, there may be several columns of b’s and of c’s when there are 
several dependent variables to be predicted from the same independent 
variables. In each of these cases when there are several columns of 
b’s, the calculations indicated in equation (2) are to be repeated for 
each column of b’s. 
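
The repetition over columns of b's can be sketched in a few lines. This is an illustration only: the function name `multiply` and the small integer matrix are hypothetical and are not the entries of Table 1.

```python
def multiply(a_rows, b_columns):
    """Return one column of c's for each column of b's.

    Each c is the sum of products of one row of a's with one column of b's,
    as in equation (1).
    """
    return [
        [sum(a * b for a, b in zip(row, b_col)) for row in a_rows]
        for b_col in b_columns
    ]


a_rows = [[2, -1, 3],
          [0, 4, 1]]              # two rows of a's, three columns
b_columns = [[1, 2, -1],          # first column of b's
             [3, 0, 2]]           # second column of b's
print(multiply(a_rows, b_columns))
```

Each inner list of the result is the column of c's belonging to one column of b's, which corresponds to one rotated axis or one dependent variable in the cases described above.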

The matrix multiplier uses an electric circuit in which some of 
the connections are made by special marks on record sheets and others 
by plug wires in a plugboard. The plan is to have a record sheet for 
each row of a’s, and, thus, as many sheets as there are rows. The b’s 
are to be wired in the plugboard. The c’s are to be read from a meter. 
With this plan there is no limit to the number of rows of a’s, but the 
number of columns is limited to 15. In order to handle a matrix 
product in which the number of columns of a’s is greater than 15, the 
matrix must be divided into several sections of 15 columns each and 
the results from these sections added together. 
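
The sectioning rule for wide matrices amounts to adding partial sums of products. A minimal sketch follows, assuming a row of a's and a column of b's of equal length; the names `MAX_COLUMNS` and `row_product_in_sections` are ours.

```python
MAX_COLUMNS = 15  # multiplier positions available on the plugboard


def row_product_in_sections(a_row, b, width=MAX_COLUMNS):
    """Sum of products for one row, accumulated over sections of `width` columns."""
    total = 0
    for start in range(0, len(a_row), width):
        section_a = a_row[start:start + width]
        section_b = b[start:start + width]
        # One pass through the machine yields the partial sum for this section.
        total += sum(a * bj for a, bj in zip(section_a, section_b))
    return total


a_row = list(range(1, 41))   # a hypothetical row with 40 columns
b = [2] * 40
print(row_product_in_sections(a_row, b))
```

With 40 columns the row is handled in three passes of 15, 15, and 10 columns, and the partial results are added by hand.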

The record sheets are the regulation answer sheets for an Inter- 
national Business Machines scoring machine. The sheet for the first 
row of a’s in Table 1 is illustrated in Figure 1. There is a section of 
the sheet for each a. Each of these sections has two rows, one for 
positive numbers and one for negative numbers. Each row has two 
sections, the left section for the tens place and the right section for
the units place. A number is indicated in a row by making as many
marks as will add up to the number, each mark in the left section add- 
ing ten, each mark in the right section adding one. Thus, in the ex- 
ample, the +.54 in the first column of the first row is indicated on the 
record sheet for this row by making five marks in the left, or tens, 
section of the first row of the record sheet and four marks in the 
right, or units, section of the first row. If the number had been nega- 
tive, the marks would have been made in the second, or negative, row. 
The —.37 is indicated by three marks in the tens section and seven 
marks in the units section of the negative row for the second number 
in the first row of a’s. 
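
The marking rule can be restated as a small encoding. The sketch below is illustrative only; entries are written as whole hundredths (54 for +.54, -37 for -.37), and the function names are hypothetical.

```python
def marks_for(value_hundredths: int):
    """Return (row, tens_marks, units_marks) for a two-digit entry."""
    if abs(value_hundredths) > 99:
        raise ValueError("the record sheet holds two-digit numbers only")
    tens, units = divmod(abs(value_hundredths), 10)
    row = "negative" if value_hundredths < 0 else "positive"
    return row, tens, units


def value_from_marks(row: str, tens: int, units: int) -> int:
    """Inverse of marks_for: each tens mark adds ten, each units mark adds one."""
    magnitude = 10 * tens + units
    return -magnitude if row == "negative" else magnitude


print(marks_for(54))    # five tens marks and four units marks in the positive row
print(marks_for(-37))   # three tens marks and seven units marks in the negative row
```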

Figure 2 shows the plugboard with the wiring for the illustrative 
problem indicated. There is a multiplier position for each b. The 
signs and the numerical values are plugged separately, the upper part 
of the board being used for the numerical values and the lower part 
for the signs. In plugging the numerical value of a b, a double wire 
is plugged into the two right-hand holes of that b’s multiplier posi- 
tion and into the two holes in the numerical value section of the board 
which correspond to the numerical value of the b. The wire from the 
top hole in the multiplier position is to go to the top hole of the two 
for the numerical value desired, and the wire from the bottom hole 
is to go to the bottom hole. In the example, the first b is 76 and the 
two wires from the first multiplier position are plugged into the two 
holes for 76 in the numerical section, the top wire from the multiplier 
position going to the top hole for 76, and the bottom wire going to 
the bottom hole for 76. The second b is 25 in numerical value and is 
plugged accordingly. 

The signs of the b’s are plugged in by connecting together two 
pairs of vertically adjacent holes which are below the multiplier posi- 
tions of the b’s. If a b is positive, the pair of holes in the first row 
below the multiplier position is connected to the pair of holes in the 
second row below the multiplier position. This case is shown by the 
first b in the illustration. A plus sign is printed on the board between 
the first two rows below the multiplier positions to indicate that posi- 
tive signs are plugged between these two rows. If the b is negative, 
as is the second one, the holes in the second row are plugged to the 
holes in the third row below the multiplier position. Whenever a 
multiplier position is not used, or when one of the b’s is zero, the holes 
in the third row and fourth row are connected together. This is the 
off position. 

The case when two b’s have the same numerical value is indicated 
in Figure 2 where the tenth and fifteenth b are both 81. The first one 
is plugged to the desired numerical value and the second one in parallel
with it by connecting its two wires into the left-hand holes in the
multiplier position for the first one. The signs of the two can be
different and are plugged separately, as shown. 

The c’s are read from a meter to two places with their signs. 

But, before the machine can be used for a problem, its set-up 
must be completed by adjusting several potentiometers. These are 
adjusted by the use of record sheets which give known answers. One 
of these sheets has a plus thirty and a minus thirty marked in each 
of its sections. This sheet should give an answer of zero for any set 
of multipliers. A second sheet is selected from a series of sheets, each 
of which gives 100 times one of the multipliers. The sheet for the 
largest multiplier is used in the adjustment of the machine to the 
problem. 

The steps in using this machine for a matrix multiplication are 
indicated below:


1. The record sheets are made. 

2. The b’s are plugged in the plugboard. 

3. The potentiometers are adjusted for the problem. 

4. The record sheets are inserted one at a time, and 
the corresponding c’s are read from the meter. 


The matrix multiplier will then multiply a matrix of any length 
and of 15 columns by a single column matrix of 15 numbers. The ma- 
chine handles two-digit numbers which are positive or negative and 
gives the answers to two digits with their signs. Although no system- 
atic check on the accuracy of the machine has been made, the general 
run of the results obtained has indicated that the errors of the machine 
are well in the third place, so that two-place accuracy can be claimed. 
An estimate of the speed of the machine can be given from its use in 
Dr. Thurstone’s laboratory. For a problem of sixty rows and twelve 
columns of a’s, for which the answer sheets are already made, it takes 
on the average 15 minutes to plug in the multipliers, adjust the ma- 
chine, run the record sheets through the machine, read the answers, 
and check the column of answers by summation checks figured by 
hand. Making out the record sheets takes more time, about two or 
three minutes per sheet; but when they are to be used for a number
of columns of b’s as in the rotational problem in factor analysis, this 
time is a small proportion of the total time spent on the problem. 
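
The paper does not spell out the summation check. One common form, assumed here purely for illustration, uses the fact that the sum of a column of c's must equal the column sums of the a's multiplied by the corresponding b's; the function name and the tolerance argument are ours.

```python
def summation_check(a_rows, b, c_column, tolerance=0):
    """Compare the sum of the c's with (column sums of a's) times the b's."""
    column_sums = [sum(column) for column in zip(*a_rows)]
    expected_total = sum(s * bj for s, bj in zip(column_sums, b))
    return abs(sum(c_column) - expected_total) <= tolerance


a_rows = [[2, -1, 3],
          [0, 4, 1]]          # hypothetical entries
b = [1, 2, -1]
print(summation_check(a_rows, b, [-3, 7]))   # answers as read from the meter
print(summation_check(a_rows, b, [-3, 8]))   # one misread answer
```

Since the machine reads to two places, a nonzero tolerance would be needed in practice to allow for rounding in the individual answers.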

The development of the matrix multiplier was made possible by 
a research grant from the Carnegie Corporation of New York, through 
the Carnegie Foundation for the Advancement of Teaching. The con-
struction of the first matrix multiplying machine was made possible
by a special grant for this purpose from the Committee on Scientific
Aids to Learning, of the National Research Council. The Carnegie
Foundation does not assume editorial responsibility for the scientific
publications that issue from its research grants. We appreciate these
grants which have enabled us to develop improved labor-saving
methods in factorial work.


LINEAR DEPENDENCE IN MULTIPLE CORRELATION WORK


MERRILL ROFF 
Indiana University 


The problem in multiple correlation work of nonsense results 
attributable to linear dependence of variables, which has been dis- 
cussed by Ragnar Frisch in relation to economic data, is presented 
from the standpoint of its significance in psychological research. It is 
shown that a symmetric correlation determinant with unity in the 
diagonal cells can vanish only when there is a first-order or partial 
correlation of unity between one pair of the variables. On the basis 
of this result, it is argued that the problem should be expected to 
cause less difficulty in the field of psychology than in economics and 
that psychologists should be able to avoid the pitfall by bringing to 
bear their knowledge of the variables with which they are working. 


The problem of linear dependence among a set of correlated 
variables has received extensive treatment by workers in the field of 
multiple factor analysis, and particularly by L. L. Thurstone (1). His 
“centroid method” has been developed to permit the isolation of 
“primary factors” within a larger group of variables. A main differ- 
ence between his procedure and other methods of factor analysis is 
his use of estimations of the communalities in the diagonal cells of 
matrices to be factored, instead of unity or reliability coefficients. 
Ragnar Frisch, in his discussion of “confluence analysis” (2), deals 
with the problem of linear dependence in multiple correlation work 
and emphasizes the danger of getting nonsense results due to the 
vanishing of multiple correlation determinants, when working with 
data in the field of economics. If the determinants in both the numera- 
tor and the denominator of the formulas for either the multiple corre- 
lation coefficient or regression coefficients should not differ significant- 
ly from zero, the resulting quotient would have what Frisch has called 
“fictitious determinateness,” since it would be only a quotient of one 
chance error divided by another. His estimate of the frequency of 
occurrence of this situation is given in the statement that “In practice 
these cases are apt to arrive much more frequently than is usually rec- 
ognized. As a matter of fact I believe that a substantial part of the 
regression and correlation analyses which have been made on economic 
data in recent years is nonsense for this very reason.” (2, p. 6). 
Frisch’s monograph is primarily concerned with methods of detecting 
the occurrence of this situation, and he outlines a “tilling” technique,
which is essentially a method of systematic expansion of multiple
correlation determinants. 

The formulas for multiple correlation and regression coefficients
are, respectively, (3)

$$R_{1.23 \cdots n} = \sqrt{1 - \frac{\Delta}{\Delta_{11}}} , \tag{1}$$

$$\beta_{12.34 \cdots n} = \frac{\Delta_{12}}{\Delta_{11}} , \tag{2}$$

where $\Delta$ represents the determinant of the first-order correlation co-
efficients among $n$ variables, with unity in the diagonal cells, and $\Delta_{pq}$
is the first minor of the element denoted by the subscripts.

As far as Frisch’s problem is concerned, the symmetric determi-
nant $\Delta_{11}$, which occurs in the denominator of each formula, is the
only one of the above determinants which need interest us at all,
since it is the only one which can cause trouble if it is near zero. If
$\Delta = 0$ and $\Delta_{11} \neq 0$, the only result would be that $R_{1.23 \cdots n} = 1$, which
is definite enough. If $\Delta_{12} = 0$ and $\Delta_{11} \neq 0$, we would have simply
$\beta_{12.34 \cdots n} = 0$, which is certainly definite and is not unusual. Conse-
quently our attention can be concentrated on the symmetric correla-
tion determinant with unity in the diagonal cells which occurs in each
of the above denominators.

The purpose of the present paper is to show that a symmetric 
correlation determinant with unity in the diagonal cells can vanish 
only when there is a first-order or partial correlation of unity between 
one pair of the variables. 

The case where two of a group of variables have a first-order 
correlation of unity is essentially trivial in this connection and will 
not receive attention again. 

Proof of the above proposition will first be given for three vari-
ables. If we take the determinant,

$$
\begin{vmatrix}
1 & r_{12} & r_{13} \\
r_{12} & 1 & r_{23} \\
r_{13} & r_{23} & 1
\end{vmatrix} ,
$$

treat one of the coefficients, say $r_{23}$, as an unknown, set the determi-
nant equal to zero, and solve for the unknown value, we can determine
the pair of values which $r_{23}$ would have in terms of the other coeffi-
cients if the determinant were to vanish. Such a procedure yields the
familiar expression

$$r_{23} = r_{12} r_{13} \pm \sqrt{r_{12}^2 r_{13}^2 - r_{12}^2 - r_{13}^2 + 1} .$$

There are two and only two values of $r_{23}$ which will permit the fore-
going determinant to vanish.


If we take the formula for the partial correlation, $r_{12.3}$,

$$
r_{12.3} = \frac{\Delta_{12}}{\sqrt{\Delta_{11}} \sqrt{\Delta_{22}}}
= \frac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2} \sqrt{1 - r_{23}^2}} ,
$$

and treat $r_{23}$ as an unknown in order to find what value it would have to
have in terms of the other coefficients for the expression to equal
unity, the result is again

$$r_{23} = r_{12} r_{13} \pm \sqrt{r_{12}^2 r_{13}^2 - r_{12}^2 - r_{13}^2 + 1} .$$

Here again there are two and only two values which will permit the
partial coefficient to equal unity, and they are identical with the values
which would permit the previous determinant to vanish.
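
The three-variable case is easy to verify numerically. The sketch below is only a check under assumed values (r12 = .5 and r13 = .3 are arbitrary, and the function names are ours): it computes the two critical values of r23 and confirms that each makes the determinant vanish and brings the partial correlation to unity in absolute value.

```python
from math import sqrt


def det3(r12, r13, r23):
    """Determinant of the 3 x 3 correlation matrix with unit diagonal."""
    return 1 + 2 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2


def partial_r12_3(r12, r13, r23):
    """Partial correlation of variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


def vanishing_r23(r12, r13):
    """The two values of r23 for which the determinant is zero."""
    root = sqrt(r12 ** 2 * r13 ** 2 - r12 ** 2 - r13 ** 2 + 1)
    return r12 * r13 - root, r12 * r13 + root


for r23 in vanishing_r23(0.5, 0.3):
    print(round(r23, 4), abs(round(det3(0.5, 0.3, r23), 10)),
          round(abs(partial_r12_3(0.5, 0.3, r23)), 10))
```

Squaring the condition for a unit partial correlation admits both +1 and -1, so the check is made on the absolute value.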
For an extension of the proof to a larger number of variables the
work will simplify if we deal with $k^2_{1.23 \cdots n}$ instead of $R_{1.23 \cdots n}$,
where

$$
k^2_{1.23 \cdots n} = \frac{\Delta}{\Delta_{11}}
= (1 - r_{12}^2)(1 - r_{13.2}^2)(1 - r_{14.23}^2) \cdots (1 - r_{1n.23 \cdots (n-1)}^2) .
$$


It is apparent that the determinant $\Delta$, in the numerator, can equal
zero if and only if one of the partial correlation coefficients in the
right-hand term equals unity. The denominator, $\Delta_{11}$, would be the
numerator in the formula for the coefficient of alienation for the pre-
diction of any variable by all the remaining variables, with the first
variable eliminated. Thus,

$$
k^2_{2.34 \cdots n} = \frac{\Delta_{11}}{\Delta_{(11)(22)}}
= (1 - r_{23}^2)(1 - r_{24.3}^2)(1 - r_{25.34}^2) \cdots (1 - r_{2n.34 \cdots (n-1)}^2) ,
$$

or

$$
k^2_{3.24 \cdots n} = \frac{\Delta_{11}}{\Delta_{(11)(33)}}
= (1 - r_{32}^2)(1 - r_{34.2}^2)(1 - r_{35.24}^2) \cdots (1 - r_{3n.24 \cdots (n-1)}^2) ,
$$

and so on. Here again it is apparent that $\Delta_{11}$ can vanish only if there
is a partial correlation of unity between some pair of the variables,
and thus that the denominator of a multiple correlation or regression
coefficient will equal zero only if there is a partial correlation of unity
between some pair of the variables.
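
For three variables the product form of the ratio of determinants can be checked directly. The following sketch uses arbitrary assumed correlations (.5, .3, .4) and hypothetical function names; it compares the ratio of the full determinant to its first principal minor with the product (1 - r12^2)(1 - r13.2^2).

```python
from math import sqrt


def det3(r12, r13, r23):
    """Determinant of the 3 x 3 correlation matrix with unit diagonal."""
    return 1 + 2 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2


def partial(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r12, r13, r23 = 0.5, 0.3, 0.4
delta = det3(r12, r13, r23)          # full determinant
delta_11 = 1 - r23 ** 2              # minor left after deleting row 1 and column 1
r13_2 = partial(r13, r12, r23)       # r_{13.2}: variables 1 and 3, variable 2 held constant

ratio = delta / delta_11
product = (1 - r12 ** 2) * (1 - r13_2 ** 2)
print(round(ratio, 6), round(product, 6))
```

Because each factor of the product lies between zero and one, the ratio can reach zero only when one of the correlations in the product reaches unity, which is the point of the argument.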

Concerning the significance of this result, first of all it can be 
said that in most fields of psychology it is almost completely impossible 
for an adequately trained worker to get an unexpected partial corre- 
lation of unity between two of a group of variables, so that the prob- 
lem should be much less likely to cause difficulty than in work with 
economic data. In the second place, it permits us to bring to bear what- 
ever empirical knowledge we have of our variables, in order to esti- 
mate the likelihood of occurrence of this particular problem. For ex- 
ample, the vanishing of such determinants when the intercorrelations 
are obtained in the study of human intellectual abilities is almost im- 
possible in the absence of two or more identical and completely reli- 
able tests. In the absence of a simple statistical test either for the 
presence of a partial correlation of unity or for the vanishing of a 
determinant of empirically obtained correlation coefficients, psycho- 
logical workers should be able to avoid this particular statistical pit- 
fall by the use of their knowledge of the relations between the vari- 
ables with which they are working. 

Finally, the result above may be of some assistance in deciding 
the relative merits of the use of communalities and of other entries in 
the diagonal cells of a matrix of correlation coefficients which are to 
be treated by factor analysis methods.


CONTRIBUTIONS TO THE MATHEMATICAL THEORY OF
HUMAN RELATIONS. IV 
OUTLINE OF A MATHEMATICAL THEORY OF INDIVIDUAL FREEDOM 


N. RASHEVSKY 
THE UNIVERSITY OF CHICAGO 


An attempt is made to define and treat analytically the concept
of individual freedom in a society. Two possible definitions are brief-
ly discussed. One takes as a measure of freedom the ratio
$(w_0 - w)/w_0$, where $w_0$ is the maximum amount of work that a per-
son can physically perform per unit time and $w$ is the amount of
work which he has actually to perform per unit time in a given
society. The other definition takes as a measure of freedom that
fraction of an individual’s time during which he can indulge in any
activity of his own choice without interfering with other individ-
uals. Expressions are derived by way of illustration, giving the in-
dividual freedom in terms of other parameters which characterize
the social structure.


In several previous publications (1, 2, 3, 4, 5), we have outlined 
possible applications of mathematical methods to social phenomena 
and descriptions of social concepts in mathematical terms. An impor- 
tant question which is frequently discussed in sociological literature 
is that of individual freedom. The concept of freedom seems at first 
glance to be particularly elusive and refractory to a mathematical 
approach. In this paper we shall illustrate by means of a few simple 
examples how such an approach can be made in principle. 

Quantitative definitions of freedom have been attempted before. 
P. Sorokin (6) defines it as the ratio of the sum total of the means 
to satisfy our desires to the sum total of the desires. He does not,
however, make any quantitative applications of this definition. It
seems to be more advantageous to use a less general definition of free- 
dom, defining it differently in its different aspects. In the present 
paper we shall consider two such aspects, without implying that these 
are the only two possible ones. 


I 
We may first consider “freedom” from the point of view which 
perhaps is best termed “economic.” Suppose an individual can per-
form physically a maximum amount $w_0$ of work per unit time. In
general, he will perform on the average a lesser amount of work $w$
per unit time, $w$ being determined by his requirements for the neces-
sities of daily life. We may then define the “economic” freedom $F_e$ of
an individual by the expression

$$F_e = \frac{w_0 - w}{w_0} . \tag{1}$$


The less work a person has to perform to keep himself alive, the freer 
he is. According to our definition, his freedom is zero when the amount 
of work which he has to perform is the maximum physically possible. 

The necessary work w is in general determined by the social 
structure of the group of which the individual is a part, and therefore 
his freedom is determined also by that social structure. 

Let us for instance consider the situation discussed in a previous 
paper (4; hereinafter referred to as H R II), in which one class (I) 
organizes and directs the work of another class (II). As we have seen 
there, the condition for the possibility of existence of Class II is (H R 
II, equation 11): 


$$S_2 = \varepsilon_2 + \theta w > 0 , \tag{2}$$

where $\varepsilon_2$ is the accumulation per unit time of goods produced by an
individual of class II without the direction of class I and $\theta$ is the frac-
tion of goods given to him from the excess which he produces under
the direction of class I. From equation (2) we have

$$w = \frac{S_2 - \varepsilon_2}{\theta} . \tag{3}$$

Introducing equation (3) into (1) we find:

$$F_e = 1 - \frac{S_2 - \varepsilon_2}{\theta w_0} . \tag{4}$$


For a given “price” $\theta$ (H R II), the economic freedom of an indi-
vidual of class II decreases with increasing amount of goods, $S_2$,
which he accumulates per unit time and increases with the
amount, $\varepsilon_2$, which measures the ability of the individual of class II
to produce goods without the organizing direction of the individuals
of class I. The less an individual of class II is capable of producing
himself, without the organizing direction of others, the less free he is.
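
The dependence described here can be tried out numerically. The sketch below is an illustration under an assumed reading of equations (1), (3), and (4), with S_2 the goods accumulated per unit time, eps_2 the accumulation without direction, theta the "price", and w_0 the maximum work; the function names and the sample values are hypothetical.

```python
def required_work(s2: float, eps2: float, theta: float) -> float:
    """Equation (3): work needed per unit time, w = (S_2 - eps_2) / theta."""
    return (s2 - eps2) / theta


def economic_freedom(w: float, w0: float) -> float:
    """Equation (1): F_e = (w_0 - w) / w_0."""
    return (w0 - w) / w0


def economic_freedom_class2(s2: float, eps2: float, theta: float, w0: float) -> float:
    """Equation (4): F_e = 1 - (S_2 - eps_2) / (theta * w_0)."""
    return 1 - (s2 - eps2) / (theta * w0)


# Arbitrary sample values: the individual loses one unit per unit time when undirected.
w = required_work(s2=2.0, eps2=-1.0, theta=0.5)
print(w, economic_freedom(w, w0=10.0))
print(economic_freedom_class2(s2=2.0, eps2=-1.0, theta=0.5, w0=10.0))
```

Raising S_2 with the other quantities fixed lowers the value returned, while raising eps_2 raises it, in line with the closing sentence of the paragraph.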

If we suppose as in H R II, that the individuals of class I fix the
value $\theta$ so as to make their gain a maximum, then $\theta$ is not arbitrary
but is given by the equation (16) of H R II, namely,


i.) [oF(n) 
= 4 (5) 
where $b$ is a constant, and $f(\eta)$ a function of the ratio of population
of the two classes, specified more exactly in H R II. Furthermore, in
this case $S_2$ is nothing else than the expression in brackets in equa-
tion (18) of H R II;


S.— @&=—b+ Vbwf(n) . (6) 


Introducing expressions (5) and (6) into (4), we find: 


b 
e—.|—_——- - 7 
j \) wof (7) (7) 


We may now study, for instance, the dependence of $F_e$ on vari-
ous parameters, such as the total population, etc. The quantity $f(\eta)$
measures essentially the amount of goods produced for constant
amount of labor expended. But this amount of goods will depend on
other things too, for instance, on the amount of raw material present
per person, which will be inversely proportional to $N = N_1 + N_2$.
Thus $f(\eta) \propto 1/N$. Similarly the constant $b$ will depend in some
way on $N$, so that $b = b(N)$. The study of the personal economic
freedom in terms of population density can thus be made. It will,
however, involve first the study of the dependence of $f(\eta)$ and of $b$
on that density.

If, instead of the situation considered in H R II, we take the somewhat
different situation discussed in another paper (5), referred to
as H R III, we would obtain a different expression for $F_2$. In all of
the three situations discussed in H R III, we could also express the
economic freedom $F_1$ of individuals of class I, as well as the freedom
$F_2$ of the individuals of class II. For in all these cases studied in
H R III we can express the amount of work $w_1$ and $w_2$ given by individuals
of classes I and II, respectively, in terms of the corresponding $w_{01}$
and $w_{02}$. Thus we have a way of expressing the individual “economic”
freedom for the different social situations assumed.


II 


We shall now consider a different aspect of individual freedom, 
not involving economic relations directly. 

Let any person in a social group have the possibility of performing
either one or several of the $n$ different activities $A_1, A_2, \ldots, A_n$.
Furthermore, let a given individual like some $m$ activities out of those
$n$ and dislike the other $n - m$. There are altogether

$$M = \sum_{m=0}^{n} \frac{n!}{m!\,(n-m)!} \tag{8}$$

ways in which different activities may be liked by the individuals. For
there are $n!/[m!\,(n-m)!]$ ways of choosing $m$ activities from $n$, and $m$
itself may vary from 0 to $n$. If the “liking” of different choices is
distributed at random, then altogether the fraction $1/M$ of all the
individuals will like a particular choice of $m$ activities and dislike all
others.

Consider, in a population of N individuals, one who has that par- 
ticular choice of activities. That individual comes into social contact 
with a certain number of his fellow individuals per unit time. The 
frequency with which he comes into such a social contact is proportional
to the total number $N$ of individuals per unit area, and may be
expressed as

$$aN, \tag{9}$$

where $a$ is a constant, depending on different external conditions, such
as ways of communication, etc. (1). Let the average duration of the
contact between two individuals be $\tau$. Then the fraction of time which
a person spends in contact with others is

$$t_c = a\tau N, \tag{10}$$

while the fraction of time which he has entirely to himself and which
may be called “free” is given by

$$t_f = 1 - a\tau N. \tag{11}$$


During the fraction of time $t_f$ the individual may indulge entirely
in the activity of his own choice, since such an activity does not
interfere with that of anyone else.

On the other hand, during the fraction of time $t_c$ he can indulge
unrestrictedly in the activity of his own choice only when he meets
individuals who enjoy the same activities. For, while he is in contact
with other individuals who have different tastes, some of his $m$
activities may interfere with theirs and he must therefore restrict himself
in that respect. But, as we have seen, altogether $1/M$ individuals
choose the same activities. If we again assume a random distribution
of the preferences amongst the individuals with whom the given
individual comes into contact, we find that from the fraction $t_c$ of the
time he spends with them, a fraction $t_c/M$ may be spent in indulging
in the activities of his own choice. We may now define as the freedom
$F$ of the individual that fraction of his total time during which he
is free to do what he wants to do. We then find

$$F = t_f + \frac{t_c}{M} = 1 - a\tau N\,\frac{M-1}{M}. \tag{12}$$

The freedom of an individual thus decreases with increasing N . 
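
As a numerical illustration of expression (12), the following sketch evaluates the freedom for given parameters. It assumes the reading $F = t_f + t_c/M$ with $t_c = a\tau N$ and $t_f = 1 - a\tau N$; the function and argument names are hypothetical and chosen only for this example.

```python
from math import comb


def number_of_preference_patterns(n: int) -> int:
    """M of equation (8): sum over m of the ways to choose m liked activities from n."""
    return sum(comb(n, m) for m in range(n + 1))


def freedom(N: float, n: int, a: float, tau: float) -> float:
    """Fraction of time during which an individual is free, per expression (12).

    N   : number of individuals per unit area
    n   : number of available activities
    a   : contact-frequency constant (assumed name)
    tau : average duration of one contact (assumed name)
    """
    M = number_of_preference_patterns(n)
    t_c = a * tau * N              # fraction of time spent in contact, eq. (10)
    if not 0.0 <= t_c <= 1.0:
        raise ValueError("a * tau * N must lie between 0 and 1")
    t_f = 1.0 - t_c                # fraction of time entirely to himself, eq. (11)
    return t_f + t_c / M


if __name__ == "__main__":
    for N in (50, 100, 200):
        print(N, freedom(N, n=3, a=0.001, tau=2.0))
```

With these illustrative values the freedom falls as N grows, in line with the statement above. The sum in (8) equals $2^n$, so M grows quickly with the number of activities.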

We may consider a more complex case, by introducing intermediate
situations. Suppose the given individual, who chooses in particular
activities $A_1, A_2, \ldots, A_{m-1}, A_m$, is in contact with another
individual, who chooses another set of $m'$ activities $A'_1, A'_2, \ldots, A'_{m'}$. Let
$m''$ of those activities be common to both. We may then say that the
first individual has to restrict himself to the amount $m''/m$, being
free to the amount $(m - m'')/m$. In that case, in order to obtain the
expression for $F$, we should add to expression (12) another term of a
rather complex structure, obtained by summation of all values
$(m - m'')/m$ taken over all possible combinations of choices which have
common elements with a given one. In this way $F$ becomes also a function
of the particular choice of activities which an individual makes,
and therefore $F$ will vary from individual to individual.

Still more complex situations may be studied if we consider some 
distribution function for the preferred choices, so that certain pref- 
erences occur more frequently than others. This leads to rather inter- 
esting mathematical problems. Further studies will open many inter- 
esting possibilities. 

The illustrations above show that even such a “purely sociologi- 
cal” concept as that of individual freedom may be made the subject 
of an exact mathematical treatment. 


SOME REMARKS ON THE KUDER-RICHARDSON
RELIABILITY COEFFICIENT 


PAUL L. DRESSEL 
MICHIGAN STATE COLLEGE 


The Kuder-Richardson reliability coefficient is derived in a man- 
ner independent of that originally given. Various alternative forms 
applicable to special situations are exhibited with the purpose of 
making them available to others interested in using this formula. 
A simplification in computation is suggested for use with a calculat- 
ing machine. 


In a paper published in 1937, Kuder and Richardson showed that 


$$r_{tt} = \frac{\sigma_t^2 - \sum_{i=1}^{n} p_i q_i + \sum_{i=1}^{n} r_{ii}\, p_i q_i}{\sigma_t^2}, \tag{1}$$

where $r_{tt}$ is the reliability of a test, $\sigma_t^2$ is the variance of the test
scores, $\sum_{i=1}^{n} p_i q_i$ is the sum of the item variances ($p_i$ being the
proportion of correct responses to item $i$ and $q_i = 1 - p_i$), and
$\sum_{i=1}^{n} r_{ii}\, p_i q_i$ is the sum of the products of item reliabilities by
item variances.* A number of modifications of the above formula were derived
and discussed, among them being

$$r_{tt} = \frac{n}{n-1} \cdot \frac{\sigma_t^2 - \sum_{i=1}^{n} p_i q_i}{\sigma_t^2}. \tag{2}$$



Later the authors found this formula (2) to be the most suitable for 
general use. 

The present article is written for the purpose of presenting addi- 
tional facts about this formula. After deriving the formula in an en- 
tirely independent manner which throws additional light upon its 
meaning, several alternative forms, useful for particular situations,
will be indicated. 

* Kuder, G. F., and Richardson, M. W. The theory of the estimation of test
reliability. Psychometrika, 1937, 2, 151-160.

Consider a test made up of $n$ items and assume that in scoring the
test the credit given for an item is 1 or 0 according as the response is
correct or incorrect. If $p_i$ is the proportion of correct responses to
item $i$ and $q_i = 1 - p_i$, the variance of item $i$ is $p_i q_i$. The total score
on the test is the sum of the item scores. If the items are independent
of each other the variance of the sum is


$$s_t^2 = \sum_{i=1}^{n} p_i q_i.$$


Assuming that the items are independent is equivalent to assuming 
that each measures a different quality. If, as is ordinarily the case in 
a test, the items are intercorrelated, the total variance is given by 


$$\sigma_t^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} r_{ij}\sqrt{p_i q_i p_j q_j} = \sum_{i=1}^{n} p_i q_i + 2\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}.$$


In an actual test most of the items will be positively intercorrelated,
so that in most cases

$$\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j} > 0,$$

and hence $\sigma_t^2 > s_t^2$. In case the items are mutually independent, $\sigma_t^2$
reduces to $s_t^2$, and if this case be included we may write $\sigma_t^2 \geq s_t^2$.
Consider now

$$r = 1 - s_t^2/\sigma_t^2.$$


In case the items are completely non-homogeneous, $r = 0$. In order
to determine the maximum value of $r$, note the form of $r$ when $s_t^2$
and $\sigma_t^2$ are replaced by the values given above. We have

$$r = 1 - \frac{\sum_{i=1}^{n} p_i q_i}{\sum_{i=1}^{n} p_i q_i + 2\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}}.$$





It is immediately apparent that the maximum value of r is obtained 
when all the inter-item correlations are equal to 1 and when, in addi- 
tion, all the item variances are equal. In this case 

$$r = 1 - \frac{npq}{npq + 2\,\dfrac{n(n-1)}{2}\,pq} = 1 - \frac{1}{n} = \frac{n-1}{n}.$$

If now we take

$$r_{tt} = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n} p_i q_i}{\sigma_t^2}\right), \tag{3}$$


it follows from the above results that, for all practical purposes, $r_{tt}$
varies between 0 and 1. This result, as is indicated by the notation,
is identical with the Kuder-Richardson formula given above.
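
A minimal sketch of formula (3), assuming the data are available as a table of 0/1 item scores with one row per subject. The function name `kr20` and the data layout are choices made for this illustration only; the test variance is taken with divisor N, which matches the use of $p_i q_i$ as item variance.

```python
def kr20(scores):
    """Formula (3) computed from a list of rows of 0/1 item scores (one row per subject)."""
    N = len(scores)          # number of subjects
    n = len(scores[0])       # number of items
    if n < 2:
        raise ValueError("at least two items are required")

    # p_i: proportion of correct responses to item i; item variance is p_i * q_i
    p = [sum(row[i] for row in scores) / N for i in range(n)]
    sum_pq = sum(p_i * (1.0 - p_i) for p_i in p)

    # variance of total scores, computed with divisor N
    totals = [sum(row) for row in scores]
    mean = sum(totals) / N
    var_t = sum((y - mean) ** 2 for y in totals) / N
    if var_t == 0:
        raise ValueError("total scores have zero variance")

    return n / (n - 1) * (1.0 - sum_pq / var_t)


if __name__ == "__main__":
    homogeneous = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
    mixed = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
    print(kr20(homogeneous), kr20(mixed))
```

The first data set has perfectly intercorrelated items of equal variance and so reaches the upper limit of 1; the second gives an intermediate value.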

Before summarizing the information obtained by the new derivation,
it will be well to examine more carefully the lower limit of $r_{tt}$.
It sometimes happens that tests contain items which correlate negatively
with most of the other items in the test. If there are many
such items or if the test is very short it is possible that
$\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}$ will be negative and hence that $r_{tt}$ will
also be negative. It is evident that $r_{tt}$ may be 0 for a similar reason.

We have demonstrated that, apart from the Kuder-Richardson 
derivation as an approximation to the ordinary reliability coefficient 
obtained by correlation techniques, formula (3) has a very definite 
interpretation. It measures the homogeneity of the items in a test, 
having the value 1 if the items are perfectly intercorrelated with equal 
variances and the value 0 if the items are mutually independent or 
if a number of the items are negatively discriminating. Perfect ho- 
mogeneity of items means simply that every item measures the same 
quality and therefore the same quality as is measured by the test as 
a whole. From this it is seen that $r_{tt}$ may be regarded as a sort of
composite measure of item “validity.” This property is not unique 
to this formula, but is, as has been shown by Richardson,* a property 
of the ordinary reliability coefficient. 

The Spearman-Brown formula has wide usage for predicting the 
reliability of a test similar in every way to a given test but $k$ times as
long. It is not obvious that the formula applies when reliability is 
computed by the method under discussion, but it is easily demon- 
strable that this is the case. The reliability of a test consisting of kn 
items is 





$$r_{tt}^{(k)} = \frac{nk}{nk-1}\left[1 - \frac{\sum_{i=1}^{nk} p_i q_i}{\sum_{i=1}^{nk} p_i q_i + 2\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}}\right].$$


If k sets of n items each are equivalent in every way, this becomes 


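$$r_{tt}^{(k)} = \frac{nk}{nk-1}\left[1 - \frac{k\sum_{i=1}^{n} p_i q_i}{k\sum_{i=1}^{n} p_i q_i + \dfrac{kn(kn-1)}{n(n-1)}\, 2\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}}\right]$$

$$= \frac{nk}{nk-1}\left[1 - \frac{(n-1)\sum_{i=1}^{n} p_i q_i}{(n-1)\sum_{i=1}^{n} p_i q_i + 2(kn-1)\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}}\right]$$

$$= \frac{nk}{nk-1}\left[\frac{2(nk-1)\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}}{(n-1)\sum_{i=1}^{n} p_i q_i + 2(kn-1)\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j}}\right],$$

where the sums on the right are taken over the original $n$ items.

* Richardson, M. W. Notes on the rationale of item analysis. Psychometrika,
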
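From the formula for $r_{tt}$, we find that

$$2\sum_{i<j} r_{ij}\sqrt{p_i q_i p_j q_j} = \frac{r_{tt}\,\sigma_t^2\,(n-1)}{n},$$

and by substituting this result in the last equation, we find

$$r_{tt}^{(k)} = \frac{nk}{nk-1} \cdot \frac{(nk-1)\,\dfrac{n-1}{n}\, r_{tt}\,\sigma_t^2}{(n-1)\,\sigma_t^2 + (n-1)(k-1)\,\sigma_t^2\, r_{tt}} = \frac{k\, r_{tt}}{1 + (k-1)\, r_{tt}}.$$
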
This result is the Spearman-Brown formula. 
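
The algebra above can be checked numerically. The sketch below assumes the simplest "equivalent in every way" case, in which every item has the same variance $pq$ and every pair of items the same correlation $\rho$; both function names are hypothetical. It computes the reliability of a test of $n$ items and of $kn$ items directly from the variance formula and compares the latter with the Spearman-Brown prediction.

```python
def reliability_from_structure(n: int, pq: float, rho: float) -> float:
    """Formula (3) for n items with common variance pq and common inter-item correlation rho."""
    sum_pq = n * pq                          # sum of item variances
    twice_cov_sum = n * (n - 1) * rho * pq   # 2 * sum over i<j of r_ij * sqrt(p_i q_i p_j q_j)
    var_t = sum_pq + twice_cov_sum           # variance of total scores
    return n / (n - 1) * (1.0 - sum_pq / var_t)


def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability of a test k times as long as one with reliability r."""
    return k * r / (1.0 + (k - 1.0) * r)


if __name__ == "__main__":
    n, k, pq, rho = 10, 3, 0.25, 0.2
    r_short = reliability_from_structure(n, pq, rho)
    r_long = reliability_from_structure(k * n, pq, rho)
    print(r_short, r_long, spearman_brown(r_short, k))
```

Under these assumptions the directly computed value for the lengthened test and the Spearman-Brown prediction agree to rounding error.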

We now turn to the consideration of some alternative forms of 
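$r_{tt}$. Suppose that $N$ subjects have taken a test and that according to
the item analysis $x_i$ people gave the correct response to the $i$th of the
$n$ items. Let $y_j$ denote the actual score obtained by the $j$th subject.
Then

$$p_i = \frac{x_i}{N}, \qquad q_i = 1 - \frac{x_i}{N}, \qquad \sigma_t^2 = \frac{\sum_{j=1}^{N} y_j^2}{N} - \left(\frac{\sum_{j=1}^{N} y_j}{N}\right)^2.$$

If these values are substituted in (3), that formula becomes

$$r_{tt} = \frac{n}{n-1}\left[1 - \frac{N\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i^2}{N\sum_{j=1}^{N} y_j^2 - \left(\sum_{j=1}^{N} y_j\right)^2}\right]. \tag{4}$$
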


Since sums and sums of squares can be run off simultaneously on a 
computing machine, the computation of $r_{tt}$ is considerably expedited
by use of this form. It should be noted that omitted questions have
been treated as wrong responses and hence this formula should be
used only when total scores on a test are also computed in that way.
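
A short sketch of the computing form (4), working only from the item counts $x_i$ and the subjects' total scores $y_j$. The function name and the list-based inputs are assumptions made for this example.

```python
def kr20_from_counts(x, y):
    """Formula (4).

    x : list of length n, x[i] = number of subjects answering item i correctly
    y : list of length N, y[j] = total score of subject j (omits scored as wrong)
    """
    n, N = len(x), len(y)
    if n < 2:
        raise ValueError("at least two items are required")

    numerator = N * sum(x) - sum(v * v for v in x)         # N * sum(x_i) - sum(x_i^2)
    denominator = N * sum(v * v for v in y) - sum(y) ** 2  # N * sum(y_j^2) - (sum y_j)^2
    if denominator == 0:
        raise ValueError("total scores have zero variance")

    return n / (n - 1) * (1.0 - numerator / denominator)


if __name__ == "__main__":
    # Four subjects, three items; item counts and totals taken from the same score table.
    print(kr20_from_counts([3, 2, 1], [2, 1, 3, 0]))
```

Only sums and sums of squares are needed, which is why the form suits machine computation.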

It frequently happens that some items of a test are weighted in 
determining a total score. The variance of total scores, o;?, auto- 
matically incorporates this weighting and some corrective factor must 
be incorporated in $\sum_{i=1}^{n} p_i q_i$. If item $i$ is weighted by the factor $a_i$, the
variance of item $i$ must be $a_i^2 p_i q_i$, and hence

$$r_{tt} = \frac{n}{n-1}\left[1 - \frac{\sum_{i=1}^{n} a_i^2\, p_i q_i}{\sigma_t^2}\right]. \tag{5}$$


When the total score on a test is computed as some linear com- 
bination of rights and wrongs, another modification of the formula is 
convenient. If a right response to item $i$ is weighted by the factor $a_i$,
a wrong response by the factor $b_i$, and if omitted questions are scored
as wrong, then the variance of item $i$ is $(a_i - b_i)^2 p_i q_i$. The complete
formula is

$$r_{tt} = \frac{n}{n-1}\left[1 - \frac{\sum_{i=1}^{n} (a_i - b_i)^2\, p_i q_i}{\sigma_t^2}\right]. \tag{6}$$


If failure to respond to a given item is not counted as a wrong re- 
sponse and is ignored in scoring, the formula becomes somewhat more 
complicated. As before, let $p_i$ denote the proportion of correct
responses to item $i$, and let $p'_i$ denote the proportion of wrong
responses. Thus if $N_i$ respond correctly to the item, $N'_i$ respond
incorrectly, and $N - N_i - N'_i$ omit it,

$$p_i = N_i/N, \qquad p'_i = N'_i/N,$$

$$q_i = 1 - p_i, \qquad q'_i = 1 - p'_i.$$

For this situation the formula becomes

$$r_{tt} = \frac{n}{n-1}\left[1 - \frac{\sum_{i=1}^{n} a_i^2\, p_i q_i + \sum_{i=1}^{n} b_i^2\, p'_i q'_i - 2\sum_{i=1}^{n} a_i b_i\, p_i p'_i}{\sigma_t^2}\right].$$
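
The item variance used when omissions are ignored can be checked against a direct computation. The sketch below assumes that a right response to an item is credited $a$, a wrong response is credited $b$ (a signed value, so a penalty is negative), and an omission is credited 0; the function names are hypothetical.

```python
def item_variance_with_omits(a, b, n_right, n_wrong, n_total):
    """a^2 p q + b^2 p' q' - 2 a b p p' for one item, omissions scored as 0."""
    p = n_right / n_total        # proportion of correct responses
    p_w = n_wrong / n_total      # proportion of wrong responses
    return a * a * p * (1 - p) + b * b * p_w * (1 - p_w) - 2 * a * b * p * p_w


def direct_item_variance(a, b, n_right, n_wrong, n_total):
    """Variance (divisor N) of the item scores built up subject by subject."""
    n_omit = n_total - n_right - n_wrong
    scores = [a] * n_right + [b] * n_wrong + [0.0] * n_omit
    mean = sum(scores) / n_total
    return sum((s - mean) ** 2 for s in scores) / n_total


if __name__ == "__main__":
    args = dict(a=1.0, b=-0.25, n_right=6, n_wrong=3, n_total=10)
    print(item_variance_with_omits(**args), direct_item_variance(**args))
```

When nobody omits the item, $p'_i = q_i$ and the expression reduces to $(a_i - b_i)^2 p_i q_i$, the item variance of formula (6).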


Occasionally scores on a test are reported as a per cent of the 
maximum possible number of correct responses rather than as the 
actual number of correct responses. Thus if $y_i$ represents the actual
number of correct responses by individual $i$ and $n$ is the maximum
number, the score in per cent would be $100\,y_i/n$. If the variance of
scores is computed from these percentages, a correction must be
introduced. Let $\sigma_{\%}^2$ denote the variance computed in this manner. Then

$$r_{tt} = \frac{n}{n-1}\left[1 - \frac{10^4 \sum_{i=1}^{n} p_i q_i}{n^2\, \sigma_{\%}^2}\right].$$


The variations of the Kuder-Richardson formula which have 
been presented represent only those which have been found to be of 
use by people utilizing this method of computing reliability. They 
are presented here with the hope of aiding and encouraging the use 
of this method rather than because any great amount of work is rep- 
resented in the derivation of them. Undoubtedly many other useful 
variations exist and can be derived as easily as those given here. 
Rashevsky, N. Advances and Applications of Mathematical Biology. Chicago:
Univ. Chicago Press, 1940. $2.00.


A REVIEW 


This book is an account of the work of Dr. Rashevsky and his associates in 
the mathematical analysis of certain problems in the fields of both psychology 
and physiology. In the case of both the sciences, the use of analysis of more com- 
plexity than mere description is quite small at present. This situation, which 
exists for various reasons which need not be discussed here, obviously must be 
changed if extensive progress is to be made. No doubt one of the principal fac- 
tors leading to such a change will be the accumulation of a body of mathematical 
analysis of demonstrated worth. Dr. Rashevsky and his associates appear to have 
assumed the task of building up such a body of analysis. 

The task is not easy, not only because the data at hand are almost certain 
to be lacking in certain essential points, but also because the presentation must 
be kept as elementary as possible in order that a sufficient number of experi- 
menters will understand it well enough to apply the tests and to make the ob- 
servations that are so necessary for keeping the analysis proceeding along fruit- 
ful lines. Since Dr. Rashevsky’s presentation is exceedingly clear, no reader with 
elementary algebra and a knowledge of the meaning of integration and of a dif- 
ferential equation will experience any difficulty in following the trend of the 
arguments. 

The book begins with the development by approximation methods of
formulae for diffusion of solutes into and out of cells of various shapes. These
formulae are used in deriving expressions for the oxygen consumption of
metabolizing cells as a function of the oxygen tension in the environment. The resulting
expressions are shown to represent closely the metabolic data from several types 
of cells and aggregates. Considerations of diffusion forces and surface tension 
are then applied to the problem of cell division. Pertinent data here are scanty 
but the right order of magnitude for the critical radius at which spheroidal cells 
will elongate is predicted. In addition, equations derived for the elongation of 
dividing cells with time in a constant environment correspond quite closely to 
the curves based on actual data. The remainder of the first half of the book has 
a chapter on each of the following subjects: growth in relation to metabolism, cel- 
lular forms and movements, and protoplasmic streaming. Unfortunately there 
are no data to test the results, but not only are the methods given for attacking 
the problems valuable in themselves, but there is particular interest in the fact 
that the phenomena concerned can be explained on the basis of diffusion forces. 

The latter half of the book is devoted to electrical excitation and to the cen- 
tral nervous system mechanism. The basic hypotheses are that transmission is 
electrical at synapses, that action currents as stimuli may produce both states of 
excitation and inhibition (accommodation) simultaneously, that either of these 
states may predominate eventually at a synapse receiving a stream of impulses, 
depending on the excitation constants at that particular synapse, and that the 
net steady state of excitation (excitation minus inhibition) at a synapse deter- 
mines the frequency of the impulses in the efferent axon. These hypotheses and 
others which are made in particular cases are admittedly speculative. Especially 
weak, perhaps, is the assignment to the synapse of a process of accommodation 
since this process, according to available evidence, is not at all general.
Nevertheless the ability of the analysis to predict the curves of reaction times, and
sensory discrimination for several senses, indicates that a considerable advance 
in method at least has been made. 

In this connection it is not amiss, perhaps, to point out to the average bio- 
logical reader that the mathematical analysis of a particular problem may be 
completely correct in principle and very wrong in non-mathematical detail. It 
may even be incorrect in its postulates and still be very useful providing it is 
more fruitful eventually to think broadly about a problem, even wrongly, than 
not to think about it at all. I cannot point out any deficiency of the book in re- 
gard to principle but in regard to detail I am sure that most readers will have, 
in some cases, notions about the nature of the constituents, the forces, and the 
reactions quite different from those that are suggested. These differences of 
opinion may ordinarily, however, in no wise affect the validity of the essential 
arguments involved. I think a careful perusal of this book will offer many valu- 
able suggestions to the experimenters in the fields considered. If these sugges- 
tions are carried out much of the essential data so lacked by the theorist now 
will be provided, and progress should be greatly increased. 

H. A. BLAIR, 

University of Rochester, 
School of Medicine, 
Rochester, N. Y. 
ABSTRACTS OF PAPERS ON THE PROGRAM ARRANGED BY
THE PSYCHOMETRIC SOCIETY AND PRESENTED AT THE
ANNUAL MEETING OF THE AMERICAN PSYCHOLOGICAL
ASSOCIATION AT PENNSYLVANIA STATE COLLEGE ON
THURSDAY, SEPTEMBER 5, 1940.


The Validity of Personality Inventories Studied by a “Guess Who” Technique. 

CHAS. C. PETERS, Pennsylvania State College. 

The purpose of this study was to determine to what extent scores based upon 
self-testimony on the Bernreuter, the Bell, and the Link personality inventories 
agree with the behavior of the subjects as observed by others. The population was 
made up of university freshmen. Descriptions of hypothetical persons standing 
high in each of nine personality traits, and descriptions of persons standing low 
in those traits, were placed in the hands of 605 subjects, who were asked to “nomi- 
nate” for each of the 18 categories persons out of their class whom they had 
observed to be much like the persons described. From the returns “high” and 
“low” classes were determined for each of the nine traits, and for one over-all 
trait, from the persons “nominated” three or more times. These selected classes 
constituted only the extreme tails of the several distributions. The Bell, the Link, 
and the Bernreuter inventories had previously been administered to the freshmen. 
A new technique for biserial r from wide-spread classes was applied to determine 
a coefficient of correlation between the scores on the inventory and observed be- 
havior, and these r’s were tested for statistical significance by certain appropriate 
newly developed formulas. The validity correlations ranged from —.078 to +.503 
and averaged about +.26. All but two of them were highly significant statistical- 
ly. The investigation revealed as high validity correlations as other indirect evi- 
dence would suggest as probable, and proved the usefulness of a new statistical 
procedure adapted to cases when measurements must be in terms of general ob- 
servations where subjects are not known by enough observers to make the usual 
type of ratings feasible. [15 min.] 


A Theory of the Stimulus. J. A. LYNCH, Rice Institute. 
The following formula for the stimulus is submitted: 


$$A_n = \sum_{i=1}^{n} a_i$$

with terms defined as follows:

$A_n$ stands for a matured learning product;

$\sum_{i=1}^{n} a_i$ assembles all situations to which $A_n$ is relevant; and

$a_i$, a particular occasion for $A_n$, is a function of two factors:

$$a_i = f[\,y_i,\ (x_i - y_i)\,],$$

where $x_i$ = total number of elements,
$y_i$ = unordered elements,
and $x_i - y_i$ = ordered elements.

An element is a factor capable of manipulation as an entity emphasizing its
symbolic efficacy.

Accrued readiness, $A_n$, influences $a_i$ as an isolated member by loading its
ordered-elements factor.

Concrete illustrations: (1) Of $a_i$: (i) a sentence expressing an idea in
which the meanings of some of the words are known and some not; (ii) a jig-saw
puzzle with some of the parts placed and some not.
(2) Of $\sum_{i=1}^{n} a_i$: (a) a series of sentences or word combinations expressing the same
idea; or (b) a series of the above-described puzzles made from the same picture.

Different types of stimulus series are based upon the possible types of learn- 
ing products which are classified roughly as rational, volitional, and sensory. 

Every stimulus series is capable of evaluation from the standpoint of three
criteria: (1) the learning process, (2) the learning product, and (3) pure
activity.

As an experiment, it is suggested that a number of segments of one of the
series described above be rated quantitatively, on the a priori basis, as $R_1$, $R_2$,
etc., comparing

$$\frac{R_1}{R_2} \ \text{with} \ \frac{T_1}{T_2} \ \text{or with} \ \frac{T_1}{T_2} \cdot \frac{P_2}{P_1},$$

$T_1$ and $T_2$ being time factors and $P_1$ and $P_2$ proficiency attained. Each practice,
$a_i$, should be separately motivated; and its concluding phase should be separated
from the initial phase of $a_{i+1}$ by a diversion of interest. [15 min.]


Analysis of Mental Growth of School Children. N. J. VAN STEENBERG, Carnegie 

Foundation. 

The purpose of this paper is to describe the normal growth of intelligence 
as defined by Stanford-Binet mental age and to compare this normal trend with 
individual growth curves superimposed on it. 

Published data from a number of sources, but principally from the Harvard 
Growth Study, have been analyzed to show the relationship between chronological 
age and Stanford-Binet mental age. It has been found that the frequency curves 
for successive CA’s are not, as expected, normal, but significantly positively skewed 
and platykurtic. It is considered important to derive a method for comparing these 
curves one with another so as to reveal norms of the growth of intelligence. 
Comparison may be carried out by two methods: (a) A rational curve with its 
parameters a function of the CA might be derived; (b) by changing the mental 
age scale into a new one by means of a nonlinear transformation a new set of 
curves approximating normality can be obtained. Both methods have been em- 
ployed, but since the concept of the normal curve is so much more readily under- 
stood by most psychologists, explanations are couched in terms of the second 
method. By means of the indices derived, a growth curve has been obtained by the 
method of absolute scaling, differing significantly from the one previously de- 
scribed by L. L. Thurstone. 

Upon these curves of moving averages there have been superimposed various 
curves based upon data from multiple observations of single individuals. These 
curves should provide a basis for the limitation of prediction of test intelligence.
[15 min., slides.]


A Criterion for the Number of Factors in a Table of Inter-correlations. CLYDE H.
COOMBS, University of Chicago.

The intercorrelations of tests of cognitive processes are generally positive, 
hence the test vectors lie in an n-dimensional cone or pyramid. Upon extracting a 
factor, a residual table of correlations is secured; a sign change is made before 
the next factor is extracted. This involves reflecting certain residual test vectors 
180° until they again lie in a cone or pyramid. The extent to which they lie 
mutually close together is dependent upon the presence of common factors. An 
index of this mutual dependence is given by the number of negative entries in the 
residual matrix after sign change. The number of negative entries expected if
only chance error factors remain depends upon the number of tests in the battery. 
A criterion based on this critical value is presented with examples of its applica- 
tion in experimental studies. [10 min.] 


On the Number of Factors. QUINN MCNEMAR, Stanford University. 

It is obvious from the literature involving the application of factorial methods 
that considerable difficulty is being experienced in assigning meaning to those 
factors extracted beyond the first few. It seems reasonable to assume that one 
cause for this predicament is the likelihood that more factors have been extracted 
than justifiable in the light of the sampling errors which affect the original 
correlational matrix. Certain of the empirical efforts to derive a criterion for the 
number of factors have been inadequate because it was wrongly assumed that 
chance sampling errors affect independently the several r’s in a table of
intercorrelations.

In order to secure situations in which the number of factors was known for 
a defined universe, and in order to give free rein to the known fact that for a 
given sample the sampling fluctuations of correlation coefficients are correlated, 
resort has been made to tables of random numbers. Variables have been defined in 
terms of a predetermined number of factors plus specifics, “scores” for samples of 
from 150 to 250 have been built up, the product moment correlations calculated, 
and the resulting matrices subjected to centroid analysis. The situations used 
include the following: 9 variables, 1 factor; 10 variables, 2 factors; 10 variables, 
3 factors; 14 variables, 3 factors (one of which was general). 

Several proposed criteria for number of factors are examined in the light of 
these analyses. [15 min., slides.] 


A Factorial Study of Visual Gestalt Effects. L. L. THURSTONE, University of 

Chicago. 

This paper is a description of a program for the study of personality types 
by objective and experimental methods and more especially with a series of in- 
dividual laboratory tests of perceptual functions. In the typological literature 
there have been many suggestions of perceptual functions that are supposed to be 
diagnostic of types. Although this field has not been seriously exploited, there has 
been some experimentation, mostly in Europe. In the present study a series of 
83 perceptual measures are used, the total program requiring about five hours of 
laboratory time for each subject. The results will be analyzed factorially in the 
hope of determining some fundamental dimensions of temperament that might be 
appraised objectively and without using paper-and-pencil questionnaires. Among 
the perceptual tests are the following: the windmill illusion, the Wundt
brightness contrast illusion, brightness constancy, size constancy, six optical illusions,
the Gottschaldt figures, Street Gestalt completion test, Schmidt’s color and form 
preference in apparent movement, after-image of movement, duration of the posi- 
tive after-image, dark adaptation time, the complication clock, peripheral span, 
flicker-fusion rate, personal tempo, apparent movement tolerance. Most of the 
tests represent visual Gestalt effects. [15 min.] 

The Isolation of Musical Abilities by Factorial Methods. J. E. KARLIN, University 

of Chicago. 

The existing evidence in the music field is discussed briefly from the point of 
view of application of factorial methods of analysis. The main body of the 
article consists of an account of an analysis of two different batteries of music 
tests by Dr. L. L. Thurstone’s multiple factor analysis technique. The trait con- 
figuration of each analysis was rotated into a promisingly intelligible simple 
structure. There appear to be, also, striking consistencies between the results of 
the two analyses indicating stability of possible music factors. The line of future 
work in this domain is indicated shortly. [15 min.] 

The Relation of Primary Mental Abilities to Preference Scales and to Vocational
Choice. DOROTHY C. ADKINS, Social Security Board, Federal Security Agency.

This paper is a report of two studies, the first of which was conducted jointly
by Dr. G. F. Kuder and the author. We were concerned with the extent to which
one’s abilities are related to the types of activities which he prefers.

The experimental edition of Thurstone’s Tests for Primary Mental Abilities, 
yielding scores on seven primary ability composites, was given to 512 University 
of Chicago freshmen in September, 1938. The same students filled out an
experimental edition of Kuder’s Preference Record, which yields scores for nine types of
activities. Results are presented in terms of ability profiles for contrasted groups 
on each preference scale for men and for women. In addition, Pearson intercorrela- 
tion coefficients of all measures used were obtained. The profiles and correlations 
reveal relatively slight overlapping between the measures of ability and the prefer- 
ence measures. The trends which do appear are in line with our expectations. If 
measures in each of these domains have prognostic value for certain criteria of 
success, a combination of the two sorts of measures ought to prove more effective 
than measures in either of the two fields alone. 

In the second study, the problem was to investigate the relations of primary 
mental abilities to vocational choice. The Primary Mental Abilities Tests were 
administered to male students in several departments of various universities. The 
subjects were either graduate students or seniors majoring in a given subject- 
matter field. Primary composite scores were averaged for each of the vocational 
fields. Results were plotted in terms of ability profiles for each vocational choice 
group. It is demonstrated that the ability profiles of the various vocational groups 
differ and that the differences are reasonable. [15 min., slides.] 
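
As an illustration only, the averaging step described in this abstract (mean primary composite scores per vocational field, giving one ability profile per group) can be sketched as follows. The field names, composite labels, and scores are hypothetical and are not data from the study; the function name is likewise an assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (vocational field, {composite label: score}).
# These values are invented for illustration, not taken from the study.
records = [
    ("engineering", {"N": 60.0, "S": 70.0, "V": 50.0}),
    ("engineering", {"N": 64.0, "S": 74.0, "V": 54.0}),
    ("law", {"N": 50.0, "S": 48.0, "V": 72.0}),
    ("law", {"N": 54.0, "S": 52.0, "V": 68.0}),
]


def ability_profiles(rows):
    """Return the mean score on each composite for each vocational field."""
    by_field = defaultdict(lambda: defaultdict(list))
    for field, scores in rows:
        for composite, score in scores.items():
            by_field[field][composite].append(score)
    return {
        field: {composite: mean(values) for composite, values in composites.items()}
        for field, composites in by_field.items()
    }


profiles = ability_profiles(records)
for field, profile in profiles.items():
    print(field, profile)
```

Each resulting dictionary is one profile; plotting the composites on a common axis for every field would give the kind of comparison the abstract describes.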


The Relation of Test Difficulty and Factorial Composition Determined From 
Individual and Group Forms of Primary Mental Abilities Tests. WILLIS C. 
SCHAEFER, University of Chicago. 


The growth of factorial studies in the cognitive field of human ability gives 
increasing evidence for the reliability of the functional unities thus determined. 
Less is known as to the conditions determining the appearance of factors, the 
validity question. This study reports an experimental investigation of the hypothe- 
sis that the perceptual component of a test is a function of the relative difficulty 
of the task and, consequently, that the perceptual factor as defined by Thurstone’s 
tests for the Primary Mental Abilities is of essentially different nature from that 
of the number, space, and verbal abilities defined by the same system. 

Following this hypothesis it should be possible to construct tests of various 
difficulties for each of several types of content such that the factorial description 
of this battery would show a fanning of test vectors for each series of tests 
between the perception axis and an axis X, where X represents, in turn, the num- 
ber, space, and verbal axes. 
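
A minimal sketch of what such a "fanning" would look like numerically, assuming two orthogonal axes (perception and X) and invented loadings for one series of tests. The abstract does not say in which direction difficulty moves a test vector, so the ordering used here is an assumption made only to show the computation.

```python
import math

# Hypothetical loadings for one series of tests: (label, perception, X).
# Invented values; the direction of the trend with difficulty is an assumption.
series = [
    ("easy", 0.70, 0.20),
    ("medium", 0.50, 0.50),
    ("hard", 0.20, 0.70),
]


def angle_from_x_axis(p_loading, x_loading):
    """Angle in degrees between a test vector and the X axis,
    measured toward the perception axis (axes assumed orthogonal)."""
    return math.degrees(math.atan2(p_loading, x_loading))


angles = [(name, angle_from_x_axis(p, x)) for name, p, x in series]
for name, angle in angles:
    print(f"{name}: {angle:.1f} degrees from X toward perception")
```

A series whose angles change steadily with difficulty would show the spread of vectors between the two axes that the hypothesis predicts.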

The test battery given to 100 college men consisted of (a) 18 paper-and-pencil 
standard reference tests for the primaries, group administered; and (b) 16 ex- 
perimental tests representing six types of test material, each in several levels of 
difficulty, individually administered and scored in terms of reaction times for each 
item. This test material was projected from 35-mm. film, the subject’s responses 
being made with finger keys to enable greater experimental control over chance 
variables. Results are reported on the relation of group and individual testing 
methods for comparable test material and on the implications of the difficulty- 
hypothesis for the factorial analysis of ability. [15 min., slides.] 